Body
Maintenance
In developing any software artifact, maintenance should be a key consideration during implementation. Many research scientists think of their software products as one-off tools that will not be around for long, and so do not plan for long-term maintenance during the development phase. Estimates of maintenance cost as a percentage of total cost of ownership vary, but the consensus is that software maintenance costs are large and increasing \cite{Glass2001,Koskinen2015,Dehaghani2013}; some estimates put maintenance at 90% of total software cost. <reference about long lived nature of research software?>
Many techniques can help reduce the cost of maintenance and speed up development. Here are some that are important but need further attention:
Documentation
Literate programming <donald knuth> - two aspects, weaving and tangling, both driven from the same source file so that documentation and executable code are kept together.
Weaving creates a document that describes the software and facilitates maintenance
Tangling produces the source code that is compiled or run by the machine
In R, roxygen2 \cite{Wickham2017} supports a similar workflow by generating reference documentation from specially formatted comments kept next to the code
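As a minimal sketch of keeping documentation beside the code, the block below shows a function documented with roxygen2 comment tags. The function `celsius_to_fahrenheit` is a hypothetical helper invented for illustration; only the `#'` comment syntax and the `@param`, `@return`, `@examples`, and `@export` tags come from roxygen2.

```r
# Hypothetical helper, used only to illustrate roxygen2 comment syntax.

#' Convert temperatures from Celsius to Fahrenheit
#'
#' @param celsius Numeric vector of temperatures in degrees Celsius.
#' @return Numeric vector of the same length, in degrees Fahrenheit.
#' @examples
#' celsius_to_fahrenheit(c(0, 100))
#' @export
celsius_to_fahrenheit <- function(celsius) {
  # Fail early on non-numeric input instead of returning a confusing error later.
  stopifnot(is.numeric(celsius))
  celsius * 9 / 5 + 32
}
```

Inside a package directory, running `roxygen2::roxygenise()` would typically turn these comments into a help file under `man/`, so the documentation is regenerated from the same source file whenever the code changes.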
Language Choice
Prefer higher-level languages to lower-level languages
Software Testing
Unit Testing
End-to-end testing - It is important to build tooling that validates functionality beyond basic unit tests.
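The difference between the two levels of testing can be sketched in base R, assuming no particular test framework. The functions `parse_counts` and `total_count` are hypothetical stand-ins for real analysis code. The unit test checks one function on a small in-memory input, while the end-to-end test writes a file, runs the whole pipeline, and checks the final result.

```r
# Hypothetical functions under test; stand-ins for real analysis code.
parse_counts <- function(lines) {
  # Each input line is assumed to look like "name,count".
  parts <- strsplit(lines, ",", fixed = TRUE)
  data.frame(
    name  = vapply(parts, `[`, character(1), 1),
    count = as.numeric(vapply(parts, `[`, character(1), 2)),
    stringsAsFactors = FALSE
  )
}

total_count <- function(df) sum(df$count)

# Unit test: exercises a single function on a small in-memory input.
test_parse_counts <- function() {
  df <- parse_counts(c("a,1", "b,2"))
  stopifnot(identical(df$name, c("a", "b")))
  stopifnot(identical(df$count, c(1, 2)))
  invisible(TRUE)
}

# End-to-end test: writes a file, runs the full pipeline, checks the result.
test_pipeline <- function() {
  path <- tempfile(fileext = ".csv")
  on.exit(unlink(path))  # remove the temporary file even if a check fails
  writeLines(c("a,1", "b,2", "c,3.5"), path)
  result <- total_count(parse_counts(readLines(path)))
  stopifnot(isTRUE(all.equal(result, 6.5)))
  invisible(TRUE)
}

test_parse_counts()
test_pipeline()
```

A dedicated framework would add reporting and test discovery on top of this, but the split between narrow unit checks and whole-pipeline checks stays the same.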
Software Optimization
Key aspects of software optimization:
- Identify performance target
- Big O Notation - https://justin.abrah.ms/computer-science/big-o-notation-explained.html
- Code Profiling
  - R
  - C++
- When there are multiple ways to solve the same problem, use microbenchmark to compare them. Some examples of what tends to be faster: prefer a matrix over a data.frame, avoid calling functions through “::”, and note that an environment with no parent environment is about 50x faster than one with a parent environment.
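The benchmarking advice above can be tried with a short sketch that compares row access on a matrix and on a data.frame holding the same values. The data sizes, the helper names `access_matrix`, `access_data_frame`, and `time_it`, and the base R fallback are assumptions made for illustration. Actual timings depend on the machine and the R version, so none are shown here.

```r
# Build a matrix and a data.frame that hold the same values.
set.seed(1)
m  <- matrix(runif(1000 * 10), nrow = 1000)
df <- as.data.frame(m)

# Two ways of solving the same problem: extracting row 500.
access_matrix     <- function() m[500, ]
access_data_frame <- function() df[500, ]

if (requireNamespace("microbenchmark", quietly = TRUE)) {
  # Each expression is evaluated 100 times and the timing distribution is summarised.
  timings <- microbenchmark::microbenchmark(
    matrix     = access_matrix(),
    data.frame = access_data_frame(),
    times = 100
  )
  print(timings)
} else {
  # Fallback using only base R: total elapsed seconds for repeated calls.
  time_it <- function(f, n = 1000) {
    system.time(for (i in seq_len(n)) f())[["elapsed"]]
  }
  timings <- c(matrix     = time_it(access_matrix),
               data.frame = time_it(access_data_frame))
  print(timings)
}
```

The `::` lookup in this sketch runs once, outside the timed expressions, so it does not affect the comparison. The same pattern applies to any pair of alternatives: wrap each one in a function, benchmark them under identical inputs, and only then choose.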