List of topics covered in other software development courses for scientists
https://www.cgl.ucsf.edu/Outreach/bmi219/ (somewhat minimal)
Aalto Science-IT "Scientific computing in practice" series: http://science-it.aalto.fi/scip/
J.P. Onnela tutorials on using Python for network analysis http://www.jponnela.com/resources/
Software Carpentry (v5)
- Introduction - why important
- Unix shell: files/directories, creating/editing, piping, loops, shell scripts, find
- git: intro, backup, collaborating, conflicts, open science (licensing, hosting)
- Python: ..., defensive programming (assertions, test-driven development, debugging), command line programs
http://software-carpentry.org/v5/rules.html (not sure if in Python section)
- Databases and SQL
- Extras: branching in git, forking a repository, code review, man pages, permissions, shell variables, working remotely (ssh), Python exceptions, computer data types.
- Instructor's notes (good advice here)
Software Carpentry (v4)
- The Unix shell
- Sets and dictionaries (why dicts/sets are good instead of just arrays all the time)
- Regular expressions
- Using Microsoft Access
- Data management tips
- Object Oriented Programming
- Program design (introduction, "the grid" (sample problem), aliasing, randomness, neighbors, handling ties, assembly, bugs, refactoring, testing, tuning)
- The topics listed above come from an example problem about invasion percolation; the actual lessons are hidden within it.
- Systems programming (browsing directories, browsing directories using walk, querying directory contents, directory and file paths, manipulating files and directories)
- Matrix programming with numpy
- Multimedia programming
- Software engineering (importance of empirical results on the topic, agile development, sturdy development (opposite of agile, planning up front, scheduling, prioritizing bugs), principles of computational thinking)
- Principles of computational thinking
- It's all just data
- Data doesn't mean anything on its own - it has to be interpreted
- Programming is about creating and composing abstractions
- Models are for computers, views are for people
- Paranoia makes us productive
- Better algorithms are better than better hardware
- The tool shapes the hand.
- Essays: How to pass configuration options?, Provenance (how to track where data/results came from), counting things, persistence/saving things.
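A minimal sketch of the point behind the v4 "Sets and dictionaries" topic (why dicts/sets instead of arrays all the time). This is not taken from the Software Carpentry material; the function names and the sample sentence are made up for illustration. Both functions count words, but the list version has to search its pairs for every word, while the dict version looks each word up directly.

```python
def count_words_with_list(words):
    """Count words using a list of [word, count] pairs (searches the list for each word)."""
    counts = []
    for word in words:
        for pair in counts:
            if pair[0] == word:
                pair[1] += 1
                break
        else:
            # Word not seen before: start a new pair.
            counts.append([word, 1])
    return counts


def count_words_with_dict(words):
    """Count words using a dict (direct lookup by key)."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


# Hypothetical sample data, only for illustration.
words = "the tool shapes the hand and the hand shapes the tool".split()
list_counts = count_words_with_list(words)
dict_counts = count_words_with_dict(words)
unique_words = set(words)  # a set keeps each distinct word once
```

Both versions give the same counts; the dict version is shorter and avoids the inner search loop, which matters once the number of distinct keys grows.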
Software Carpentry (v3)
https://github.com/swcarpentry/v3 (see index.html for topics)
- Python strings, lists, files
- Python functions/modules
- version control
- Python sets and dictionaries
- image processing
- basic Unix shell
- advanced Unix shell
- automated builds
- Python basic OOP
- Python advanced OOP
- quality assurance
- unit testing
- regular expressions
- binary data
- GUI programming
- web client programming
- web application programming
- empirical software engineering
- software development lifecycles
UW High Performance Scientific Computing (Coursera)
- Working at the command line in Unix-like shells (e.g. Linux or a Mac OSX terminal).
- Version control systems, particularly git, and the use of GitHub and Bitbucket repositories.
- Work habits for documentation of your code and reproducibility of your results.
- Interactive Python using IPython, and the IPython Notebook.
- Python scripting and its uses in scientific computing.
- Subtleties of computer arithmetic that can affect program correctness.
- How numbers are stored: binary vs. ASCII representations, efficient I/O.
- Fortran 90, a compiled language that is widely used in scientific computing.
- Makefiles for building software and checking dependencies.
- The high cost of data communication. Registers, cache, main memory, and how this memory hierarchy affects code performance.
- OpenMP on top of Fortran for parallel programming of shared memory computers, such as a multicore laptop.
- MPI on top of Fortran for distributed memory parallel programming, such as on a cluster.
- Parallel computing in IPython.
- Debuggers, unit tests, regression tests, verification and validation of computer codes.
- Graphics and visualization of computational results using Python.
- the shell
- storing large arrays
- data types
- version control, Bitbucket, git, gitk
- demo developing a sqrt function, test functions, nose
- Python concepts, objects/methods, lists, tuples
- IPython notebook
- timing Python code
- Python debugging
- Fortran: functions/subroutines, arrays, memory management
- multi-file codes and makefiles
- computer architecture and memory optimization: measuring performance, memory hierarchy, reducing latency, caches, basic optimization considerations
- compiler optimization
- parallel computing
- Amdahl's law
- thread safety, pure functions
- fine- vs coarse-grained parallelization
- LAPACK/BLAS and linear algebra
- Cray supercomputers and PGAS (partitioned global address space)
- Monte Carlo methods
- JIT compilers for Python
- I/O, ASCII vs binary, HDF, NetCDF
- summary and take-away messages
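A rough sketch of what the "demo developing a sqrt function, test functions, nose" item in the list above might look like. This is not the course's code: the use of Newton's method, the name sqrt_newton, and the tolerance and iteration limit are all assumptions made for illustration. The test functions use plain asserts and a test_ prefix, so a test runner such as nose can collect them, and they can also be called directly.

```python
def sqrt_newton(x, tol=1e-12, max_iter=100):
    """Approximate the square root of x with Newton's method (illustrative only)."""
    if x < 0:
        raise ValueError("x must be non-negative")
    if x == 0:
        return 0.0
    # Initial guess: x itself for x >= 1, otherwise 1.0.
    s = float(x) if x >= 1 else 1.0
    for _ in range(max_iter):
        s_new = 0.5 * (s + x / s)
        # Stop when the relative change between iterates is small.
        if abs(s_new - s) <= tol * s_new:
            return s_new
        s = s_new
    return s


def test_known_values():
    for n in (0.25, 1, 4, 9, 16, 1e6):
        expected = n ** 0.5
        assert abs(sqrt_newton(n) - expected) < 1e-9 * max(1.0, expected)


def test_negative_input_raises():
    try:
        sqrt_newton(-1.0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for negative input")
```

Writing the tests next to the function from the start makes it easy to rerun them after every change, which is the habit the demo topic points at.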
= Aalto Science-IT "Scientific computing in practice" kickstart 2014