Coursera-dl: A Coursera download script

I have blogged about coursera.org in the past. Having signed up for a number of courses, I wanted an easy way to download the videos, quizzes, notes, etc. for offline use.

I quickly found a project on GitHub (there are a few) but wasn’t quite happy with the code. I cleaned it up to a relatively sensible state and it now does what I wanted it to do. The main additional features I wanted were: easy downloading of multiple courses, support for quizzes/homework, and support for links to extra material (e.g., third-party sites, papers, etc.).

Just do a “pip install coursera-dl” and then run as follows:

coursera-dl -u myusername -p mypassword -d /my/courses/ algo-2012-001 ml-2012-002
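
Everything after the options is a course name, so the command above downloads two courses (algo-2012-001 and ml-2012-002). As a rough illustration of how such a command line can be parsed, here is a minimal Python 3 sketch using argparse. The flags -u, -p and -d and the positional course names come from the command above; the `build_parser` and `parse` helpers, the destination names and the default directory are assumptions made for illustration, not the script's actual code.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the flags shown in the command above.
    parser = argparse.ArgumentParser(prog="coursera-dl")
    parser.add_argument("-u", dest="username", required=True, help="Coursera username")
    parser.add_argument("-p", dest="password", required=True, help="Coursera password")
    # Assumption: default to the current directory when -d is not given.
    parser.add_argument("-d", dest="dest_dir", default=".", help="download directory")
    # One or more course names, e.g. algo-2012-001 ml-2012-002.
    parser.add_argument("course_names", nargs="+", help="one or more course names")
    return parser


def parse(argv):
    # Parse an explicit argument list so the function is easy to try out.
    return build_parser().parse_args(argv)
```

Calling `parse` with the arguments from the command above gives a `course_names` list with both courses in it.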

The code is written in Python and can be found on GitHub.

Some people have asked if they could donate something. If you wish you can do that here:

Donate Button

Update: if you have a feature request or want to report a bug, please use the GitHub issue system.

–Dirk

87 thoughts on “Coursera-dl: A Coursera download script”

  1. Hi I am trying to use your script to download scientificcomputing course and I can’t get past the following error. Do you have any idea what is generating this exception? The system seems to collect all the information but then throws exception during download. I am running your script on Windows 64.

    Trace

    N:\Coursea\testnewscr>python.exe coursera-dl.py -u Jacob@****.com -p 123456 scientificcomp-2012-001
    * Authenticating…
    * Collecting downloadable content from http://class.coursera.org/scientificcomp-2012-001/lecture/index
    * Got all downloadable content for scientificcomp-2012-001
    * scientificcomp-2012-001 will be downloaded to N:\Coursea\testnewscr\scientificcomp-2012-001
    – Downloading lecture/syllabus pages
    Traceback (most recent call last):
    File "coursera-dl.py", line 342, in <module>
    d.download_course(cn,dest_dir=args.target_dir)
    File "coursera-dl.py", line 143, in download_course
    self.download(course_url,target_fname=os.path.join(target_dir,"lectures.html"))
    File "coursera-dl.py", line 124, in download
    print "Failed to download url %s to %s: %s" % (url,folder,e)
    NameError: global name 'folder' is not defined

    • Hi Rod, do you have the scientific computing lectures after week 3? I joined the course late, and now the lectures have been removed from the site and won’t be back for a few weeks, when the new version of the course starts. Could you please share the material? Thanks a lot in advance.

      • That should definitely not be necessary, everything works as expected here. So to be clear, you invoke the script with your correct username/pwd as cmdline arguments, the script authenticates, downloads the video page and then terminates without error? If so please file an issue on github with all the output.

        Note though that you have to manually accept the honor code for a class before you can use the script to download anything from it. Accepting the honor code happens the very first time you go to the class page.

  2. This causes invalid syntax in setup.py
    requires=[‘argparse’, ‘beautifulsoup4′]

    As I have never used python before, elaborate explanation would be appreciated very much.

  3. Also I am having difficulty using the script coursera-dl.py

    Would you please explain how to pass the arguments for the scientific computing course on Coursera?

  4. Thanks for this Dirk, it’s really useful. For what it’s worth the script worked for me on Python 2.6.6 once I installed argparse (which pip didn’t know was a requirement).

  5. Nice program.

    One comment, the program does not seem to note whether a user used the correct passwd or not, result being that if the passwd is incorrect, only a few files will be downloaded w/o any error. Is there a way to make it exit w/ “wrong passwd”?

  6. Thanks for your work.

    I got this error:
    Traceback (most recent call last):
    File "c:\Python27\Scripts\coursera-dl-script.py", line 8, in <module>
    load_entry_point('coursera-dl==1.1.4', 'console_scripts', 'coursera-dl')()
    File "c:\Python27\lib\site-packages\courseradownloader\courseradownloader.py", line 397, in main
    d.download_course(cn,dest_dir=args.target_dir)
    File "c:\Python27\lib\site-packages\courseradownloader\courseradownloader.py", line 236, in download_course
    if not os.path.exists(dirName): os.makedirs(dirName)
    File "c:\Python27\lib\os.py", line 157, in makedirs
    mkdir(name, mode)
    WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: u'03 – Python as a Calculator (1035)\n\u2710 Quiz Attempted'

  8. Mmm, tested here on OS X & Linux, works perfectly. I don’t have a Windows machine handy at the moment. I have made a tiny change and updated the package, so see if that helps. If you still get a crash, add “print dirName” on line 235 and let me know what the output is. Also, I would prefer you handle this by creating an issue on GitHub.

    • Sorry, but I don’t have “Issues” section for this project…

      The script is trying to create a directory with a ‘\’ symbol in its name. That can’t be done on Windows…

      • Made another change, double checked and it now runs without error on windows for me here. Also enabled issues on github. Let me know if it all works.
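
The Windows failure discussed in this thread comes from a directory name containing characters that Windows rejects, such as a backslash or a line break. Below is a minimal sketch of one way to clean such names before creating the directory. `sanitise_dirname` is a hypothetical helper written for illustration in Python 3; it is not the change that was actually made to the script.

```python
import re

# Characters Windows does not allow in file or directory names,
# plus the ASCII control characters (which include line breaks).
_WINDOWS_FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitise_dirname(name, replacement="_"):
    # Hypothetical helper: keep only the first line of the name,
    # then replace every forbidden character.
    stripped = name.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    cleaned = _WINDOWS_FORBIDDEN.sub(replacement, first_line)
    # Windows also rejects names that end in a dot or a space.
    return cleaned.rstrip(". ") or replacement
```

Keeping only the first line is a design choice for this sketch: it drops trailing status text such as “Quiz Attempted” that is not really part of the section title.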

  9. Great script, thanks…exactly what I needed!
    Unfortunately I get an auth exception

    raise Exception(“Failed to authenticate as %s” % (self.username,))
    Exception: Failed to authenticate as MY@EMAIL.COM

    I was trying to alter the script since the coursera login page might have changed…any ideas about this? Is anyone else experiencing this issue?

      • Hey thanks for the quick response. reinstalled using pip, my password had a $ not sure if that matters…anyway i got compinvesting1-2012-001 :) awesome.

        I got an error when downloading compfinance-2012-001 though…. (could you give me some hints on how to hack it??)

        vurl = bb.find(‘source’,type=”video/mp4″)[‘src’]
        TypeError: ‘NoneType’ object is not subscriptable

        Thanks

      • Try again, it should no longer crash now. However, it may not download all videos (something I may or may not be able to do something about). Please create an issue about this on github.
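
The TypeError in this thread is what happens when a page has no matching source tag: the lookup returns None, and indexing None with ['src'] fails. The sketch below shows the general idea of returning None instead of crashing. It uses Python 3's standard-library html.parser rather than BeautifulSoup so that it runs without extra packages; `find_mp4_url` and the helper class are hypothetical names, and this is not the fix that went into the script.

```python
from html.parser import HTMLParser


class _Mp4SourceFinder(HTMLParser):
    """Records the src of the first <source type="video/mp4"> tag seen."""

    def __init__(self):
        super().__init__()
        self.src = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if self.src is None and tag == "source" and attributes.get("type") == "video/mp4":
            self.src = attributes.get("src")


def find_mp4_url(html):
    # Returns None instead of raising when the page has no MP4 source,
    # so the caller can skip that lecture and carry on.
    finder = _Mp4SourceFinder()
    finder.feed(html)
    return finder.src
```

A caller would then check `if url is None:` and move on to the next lecture rather than stopping the whole download.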

  10. hi!

    Can you add an option to download everything except the videos (I have already downloaded them manually)? I tried to find a line in the code to comment out to skip them, but had no luck.

    And an option to download all “Previous Attempts” (Review) for the quizzes (just as an archive after the end of the course)?

    thank you very much !

  11. Hey, thanks for sharing this code.
    I have a request for you..
    Our university limits the bandwidth of each download, but downloads can run in parallel. For example, with a limit of 200 Kbps per download, 10 parallel downloads get 200 × 10 Kbps in total. Could you provide a version where downloads run in parallel?
    I personally use gnu-parallel along with wget to achieve the same.
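
For readers wondering what parallel downloads could look like in Python, here is a minimal Python 3 sketch using a thread pool from the standard library. `download_all` and its `fetch` argument are hypothetical: the function that downloads a single file is passed in, because the post does not show how the script does that, and the script itself does not offer this feature.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(urls, fetch, max_workers=10):
    # fetch is any callable that takes a URL and downloads it; it is passed
    # in so that the sketch does not assume how a single file is fetched.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map keeps the results in input order and re-raises the first
        # exception raised by a worker.
        return list(pool.map(fetch, urls))
```

With `max_workers=10`, up to ten downloads run at once, which matches the scenario described in the comment above.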

  12. Pingback: Learning, Doing, Talking. Whats your balance? | Dirk's Page

  13. Traceback (most recent call last):
    File "/usr/local/bin/coursera-dl", line 8, in <module>
    load_entry_point('coursera-dl==1.1.12', 'console_scripts', 'coursera-dl')()
    File "/Library/Python/2.7/site-packages/courseradownloader/courseradownloader.py", line 452, in main
    d.download_course(cn,dest_dir=args.dest_dir)
    File "/Library/Python/2.7/site-packages/courseradownloader/courseradownloader.py", line 195, in download_course
    os.mkdir(course_dir)
    OSError: [Errno 2] No such file or directory: '/my/courses/db'
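
An error like this one is typical when the parent of the target directory does not exist: os.mkdir creates only the last path component, so it fails if /my/courses is missing. A minimal Python 3 sketch of the usual remedy follows. `ensure_dir` is a hypothetical helper, not code from the script, and the `exist_ok` argument needs Python 3.2 or later.

```python
import os


def ensure_dir(path):
    # os.mkdir fails if a parent directory is missing; os.makedirs creates
    # the whole chain. exist_ok avoids an error when the directory is
    # already there.
    os.makedirs(path, exist_ok=True)
    return path
```

Alternatively, creating the download directory by hand before running the script avoids the problem.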

  14. Thanks a lot for creating this. Being a non-programmer, I need some more clarification. Can you explain how to use your .py files on Windows? I am just running them in Python, and that does not create the required script file in my python27/scripts folder. I also ran them in Aptana Studio (PyDev), thinking they would run as a whole package, but with no success. There must be some basic flaw in my approach. Please explain it step by step; your post would then be helpful to a lot of people.

  15. This is great. Your script works perfectly for me. I am trying to learn python so your code serves as a great learning tool for me to learn python coding as well. If you don’t mind me asking, how many years of python programming experience do you have to be able to write a professional and quality program like this ? Thanks a lot.

    • Thanks, but I wouldn’t quite call it professional quality code yet :) It still carries with it the structure of the code I forked it from and there are always some things you can improve. I only have about 3 full time years of Python experience.

  16. Hi Dgorissen
    Thanks again, it is working fine. One thing I didn’t understand: in the download command, what does the last word (ml-2012-002) refer to?
    I think the second-to-last word is the course name, but I can’t work out what the last one is.

    coursera-dl -u myusername -p mypassword -d /my/courses/ algo-2012-001 ml-2012-002

  17. Hi Dgorissen
    I tried to download the ‘compfinance-002’ contents.
    It only downloaded the videos, ppt, srt and index page, but it didn’t download the assignments and exercises.
    Is there a problem with the script?

    Thanks
    Gaurav

  18. Thanks so much for the script. I have used it before successfully. But, when I tried it today, it isn’t working with any of the courses. It just downloads the index.html and lectures.html and stops.
    I don’t get any error. Any idea?

    Output –
    Warning: lxml not available, falling back to built-in 'html.parser' (see -q option), this may cause problems on Python < 2.7.3
    HTML parser set to html.parser
    * Authenticating as vishwesh99@yahoo.com
    * Already logged in
    * Collecting downloadable content from http://class.coursera.org/compinvesting1-002/lecture/index
    * Got all downloadable content for compinvesting1-002
    * compinvesting1-002 will be downloaded to C:\DELETE\compinvesting1-002
    – Downloading lecture/syllabus pages

  19. Hey thanx for the script, its awesome..
    nlangp-001 this particular course is not working, I get errors but others are working.

  20. Hi , thanks for your great work. I tried to install using “pip install coursera-dl”
    but i got this result:

    Downloading/unpacking coursera-dl
    Running setup.py egg_info for package coursera-dl
    Traceback (most recent call last):
    File "<string>", line 3, in <module>
    ImportError: No module named setuptools.command
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
    File "<string>", line 3, in <module>
    ImportError: No module named setuptools.command

    ----------------------------------------
    Command python setup.py egg_info failed with error code 1
    Storing complete log in ./pip-log.txt

    any comment how to solve this?

  21. Can your script resume incomplete downloads or is it possible to specify the sections to skip downloading?

    • It won’t download an existing file again, but you can’t skip sections or resume incomplete downloads. It would be great to have those features, but unfortunately I don’t have the time to add them.
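
The “won’t download an existing file again” behaviour can be pictured with a small Python 3 sketch. The function names and the `fetch` argument are hypothetical, and treating an empty file as missing is an extra assumption of this sketch, not something the post states about the script.

```python
import os


def needs_download(target_path):
    # Assumption: a file counts as downloaded if it exists and is not empty.
    return not (os.path.isfile(target_path) and os.path.getsize(target_path) > 0)


def download_if_missing(url, target_path, fetch):
    # fetch(url) must return the file contents as bytes; how the real script
    # fetches a URL is not shown in the post, so it is passed in here.
    if not needs_download(target_path):
        return False
    with open(target_path, "wb") as handle:
        handle.write(fetch(url))
    return True
```

Note that a check like this cannot tell a complete file from a partially downloaded one, which is why an interrupted download is not resumed.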

  22. I realised that I did not download the course while it was active :( Do they keep the courses available for later? Thanks for your script, it is incredible!

  23. Pingback: Downloading courses from coursera | Raony Guimarães

  24. hi,

    Thanks for nice script.
    I was downloading a class and, due to some error, it stopped midway. Now if I start the script again, it starts from the very beginning and re-downloads what has already been downloaded.

    Could you please suggest a way to resume the download from where the error occurred?

    Thanks

  25. Is there any way to get it to only download the video lectures?

    Kudos on an awesome contribution to a project that I don’t believe gets enough credit for what it has done!

  26. Hi, I was able to use the code to download some courses, but it didn’t work on others. It keeps asking me to accept the honour code, which I did accept, and I was able to watch the videos. Not sure why. It happened with at least 5 or 6 courses I tried. Any insight?
    Course 1 of 1
    * Collecting downloadable content from https://class.coursera.org/digitalmedia-001/lecture/index
    Warning: no downloadable content found for digitalmedia-001, did you accept the honour code?

  27. Question from a first time python user – where do you save the file so when you run
    pip install coursera-dl
    python finds the file? I’m getting “syntaxerror”
    Additionally is python 3.3 supported?

  28. Hello, this is a great script :-) I have read that downloading quizzes is disabled due to

    https://github.com/dgorissen/coursera-dl/issues/2

    but is there a possibility to add a switch in args that will enable downloading quizzes?
    I am downloading some already finished courses, and it is not a problem for me that downloading will increase the attempt counter. I suppose it may be difficult for all courses, but I think it could work with courses that have a simple page with a “Start quiz now” button. This option would be very helpful.

    • Please comment on the GitHub issues, don’t use this page. It’s not trivial to get this completely right and unfortunately I don’t have enough time to make this a priority. I will consider any pull requests though.

  29. Thanks to this handy script, I have downloaded over 300 GB of lectures from over 75 courses.
    Thanks a lot, friend. Keep making such useful time-saving scripts.
    Lots of blessings from all those who benefited.
