What is ‘Software Quality Assurance’?
What is ‘Software Testing’?
What are some recent major computer system failures caused by software bugs?
Why is it often hard for management to get serious about quality assurance?
Why does software have bugs?
How can new Software QA processes be introduced in an existing organization?
What is verification? Validation?
What is a ‘walkthrough’?
What’s an inspection’?
What kinds of testing should be considered?
What are 5 common problems in the software development process?
What are 5 common solutions to software development problems?
What is software ‘quality’?
What is ‘good code’?
What is ‘good design’?
What is SEI? CMM? ISO? Will it help?
What is the ’software life cycle’?
Will automated testing tools make testing easier?
What is ‘Software Quality Assurance’?
Software QA involves the entire software development PROCESS - monitoring and improving the process, making sure that any agreed-upon standards and procedures are followed, and ensuring that problems are found and dealt with. It is oriented to ‘prevention’.
What is ‘Software Testing’?
Testing involves operation of a system or application under controlled conditions and evaluating the results (eg, ‘if the user is in interface A of the application while using hardware B, and does C, then D should happen’). The controlled conditions should include both normal and abnormal conditions. Testing should intentionally attempt to make things go wrong to determine if things happen when they shouldn’t or things don’t happen when they should. It is oriented to ‘detection’. Organizations vary considerably in how they assign responsibility for QA and testing. Sometimes they’re the combined responsibility of one group or individual. Also common are project teams that include a mix of testers and developers who work closely together, with overall QA processes monitored by project managers. It will depend on what best fits an organization’s size and business structure.
· In August of 2003 a U.S. court ruled that a lawsuit against a large online brokerage company could proceed; the lawsuit reportedly involved claims that the company was not fixing system problems that sometimes resulted in failed stock trades, based on the experiences of 4 plaintiffs during an 8-month period. A previous lower court’s ruling that “…six miscues out of more than 400 trades does not indicate negligence.” was invalidated.
· In April of 2003 it was announced that the largest student loan company in the U.S. made a software error in calculating the monthly payments on 800,000 loans. Although borrowers were to be notified of an increase in their required payments, the company will still reportedly lose $8 million in interest. The error was uncovered when borrowers began reporting inconsistencies in their bills.
· News reports in February of 2003 revealed that the U.S. Treasury Department mailed 50,000 Social Security checks without any beneficiary names. A spokesperson indicated that the missing names were due to an error in a software change. Replacement checks were subsequently mailed out with the problem corrected, and recipients were then able to cash their Social Security checks.
· In March of 2002 it was reported that software bugs in Britain’s national tax system resulted in more than 100,000 erroneous tax overcharges. The problem was partly attributed to the difficulty of testing the integration of multiple systems.
· A newspaper columnist reported in July 2001 that a serious flaw was found in off-the-shelf software that had long been used in systems for tracking certain U.S. nuclear materials. The same software had been recently donated to another country to be used in tracking their own nuclear materials, and it was not until scientists in that country discovered the problem, and shared the information, that U.S. officials became aware of the problems.
· According to newspaper stories in mid-2001, a major systems development contractor was fired and sued over problems with a large retirement plan management system. According to the reports, the client claimed that system deliveries were late, the software had excessive defects, and it caused other systems to crash.
· In January of 2001 newspapers reported that a major European railroad was hit by the aftereffects of the Y2K bug. The company found that many of their newer trains would not run due to their inability to recognize the date ‘31/12/2000′; the trains were started by altering the control system’s date settings.
· News reports in September of 2000 told of a software vendor settling a lawsuit with a large mortgage lender; the vendor had reportedly delivered an online mortgage processing system that did not meet specifications, was delivered late, and didn’t work.
· In early 2000, major problems were reported with a new computer system in a large suburban U.S. public school district with 100,000+ students; problems included 10,000 erroneous report cards and students left stranded by failed class registration systems; the district’s CIO was fired. The school district decided to reinstate it’s original 25-year old system for at least a year until the bugs were worked out of the new system by the software vendors.
· In October of 1999 the $125 million NASA Mars Climate Orbiter spacecraft was believed to be lost in space due to a simple data conversion error. It was determined that spacecraft software used certain data in English units that should have been in metric units. Among other tasks, the orbiter was to serve as a communications relay for the Mars Polar Lander mission, which failed for unknown reasons in December 1999. Several investigating panels were convened to determine the process failures that allowed the error to go undetected.
· Bugs in software supporting a large commercial high-speed data network affected 70,000 business customers over a period of 8 days in August of 1999. Among those affected was the electronic trading system of the largest U.S. futures exchange, which was shut down for most of a week as a result of the outages.
· In April of 1999 a software bug caused the failure of a $1.2 billion U.S. military satellite launch, the costliest unmanned accident in the history of Cape Canaveral launches. The failure was the latest in a string of launch failures, triggering a complete military and industry review of U.S. space launch programs, including software integration and testing processes. Congressional oversight hearings were requested.
· A small town in Illinois in the U.S. received an unusually large monthly electric bill of $7 million in March of 1999. This was about 700 times larger than its normal bill. It turned out to be due to bugs in new software that had been purchased by the local power company to deal with Y2K software issues.
· In early 1999 a major computer game company recalled all copies of a popular new product due to software problems. The company made a public apology for releasing a product before it was ready.
· The computer system of a major online U.S. stock trading service failed during trading hours several times over a period of days in February of 1999 according to nationwide news reports. The problem was reportedly due to bugs in a software upgrade intended to speed online trade confirmations.
· In April of 1998 a major U.S. data communications network failed for 24 hours, crippling a large part of some U.S. credit card transaction authorization systems as well as other large U.S. bank, retail, and government data systems. The cause was eventually traced to a software bug.
· January 1998 news reports told of software problems at a major U.S. telecommunications company that resulted in no charges for long distance calls for a month for 400,000 customers. The problem went undetected until customers called up with questions about their bills.
· In November of 1997 the stock of a major health industry company dropped 60% due to reports of failures in computer billing systems, problems with a large database conversion, and inadequate software testing. It was reported that more than $100,000,000 in receivables had to be written off and that multi-million dollar fines were levied on the company by government agencies.
· A retail store chain filed suit in August of 1997 against a transaction processing system vendor (not a credit card company) due to the software’s inability to handle credit cards with year 2000 expiration dates.
· In August of 1997 one of the leading consumer credit reporting companies reportedly shut down their new public web site after less than two days of operation due to software problems. The new site allowed web site visitors instant access, for a small fee, to their personal credit reports. However, a number of initial users ended up viewing each other’s reports instead of their own, resulting in irate customers and nationwide publicity. The problem was attributed to “…unexpectedly high demand from consumers and faulty software that routed the files to the wrong computers.”
· In November of 1996, newspapers reported that software bugs caused the 411-telephone information system of one of the U.S. RBOC’s to fail for most of a day. Most of the 2000 operators had to search through phone books instead of using their 13,000,000-listing database. The bugs were introduced by new software modifications and the problem software had been installed on both the production and backup systems. A spokesman for the software vendor reportedly stated that ‘It had nothing to do with the integrity of the software. It was human error.’
· On June 4 1996 the first flight of the European Space Agency’s new Ariane 5 rocket failed shortly after launching, resulting in an estimated uninsured loss of a half billion dollars. It was reportedly due to the lack of exception handling of a floating-point error in a conversion from a 64-bit integer to a 16-bit signed integer.
· Software bugs caused the bank accounts of 823 customers of a major U.S. bank to be credited with $924,844,208.32 each in May of 1996, according to newspaper reports. The American Bankers Association claimed it was the largest such error in banking history. A bank spokesman said the programming errors were corrected and all funds were recovered.
· Software bugs in a Soviet early-warning monitoring system nearly brought on nuclear war in 1983, according to news reports in early 1999. The software was supposed to filter out false missile detections caused by Soviet satellites picking up sunlight reflections off cloud-tops, but failed to do so. Disaster was averted when a Soviet commander, based on a what he said was a ‘…funny feeling in my gut’, decided the apparent missile attack was a false alarm. The filtering software code was rewritten.
Why is it often hard for management to get serious about quality assurance?
Solving problems is a high-visibility process; preventing problems is low-visibility. This is illustrated by an old parable:
In ancient China there was a family of healers, one of whom was known throughout the land and employed as a physician to a great lord. The physician was asked which of his family was the most skillful healer. He replied,
“I tend to the sick and dying with drastic and dramatic treatments, and on occasion someone is cured and my name gets out among the lords.”
“My elder brother cures sickness when it just begins to take root, and his skills are known among the local peasants and neighbors.”
“My eldest brother is able to sense the spirit of sickness and eradicate it before it takes form. His name is unknown outside our home.”
· Miscommunication or no communication - as to specifics of what an application should or shouldn’t do (the application’s requirements).
· Software complexity - the complexity of current software applications can be difficult to comprehend for anyone without experience in modern-day software development. Windows-type interfaces, client-server and distributed applications, data communications, enormous relational databases, and sheer size of applications have all contributed to the exponential growth in software/system complexity. And the use of object-oriented techniques can complicate instead of simplify a project unless it is well engineered.
· Programming errors - programmers, like anyone else, can make mistakes.
· Changing requirements - the customer may not understand the effects of changes, or may understand and request them anyway - redesign, rescheduling of engineers, effects on other projects, work already completed that may have to be redone or thrown out, hardware requirements that may be affected, etc. If there are many minor changes or any major changes, known and unknown dependencies among parts of the project are likely to interact and cause problems, and the complexity of keeping track of changes may result in errors. Enthusiasm of engineering staff may be affected. In some fast-changing business environments, continuously modified requirements may be a fact of life. In this case, management must understand the resulting risks, and QA and test engineers must adapt and plan for continuous extensive testing to keep the inevitable bugs from running out of control.
· Time pressures - scheduling of software projects is difficult at best, often requiring a lot of guesswork. When deadlines loom and the crunch comes, mistakes will be made.
· Egos - people prefer to say things like:
· ‘no problem’
· ‘piece of cake’
· ‘I can whip that out in a few hours’
· ‘it should be easy to update that old code’
· instead of:
· ‘that adds a lot of complexity and we could end up
· making a lot of mistakes’
· ‘we have no idea if we can do that; we’ll wing it’
· ‘I can’t estimate how long it will take, until I
· take a close look at it’
· ‘we can’t figure out what that old spaghetti code
· did in the first place’
· If there are too many unrealistic ‘no problem’s’, the
· result is bugs.
· Poorly documented code - it’s tough to maintain and modify code that is badly written or poorly documented; the result is bugs. In many organizations management provides no incentive for programmers to document their code or write clear, understandable code. In fact, it’s usually the opposite: they get points mostly for quickly turning out code, and there’s job security if nobody else can understand it (’if it was hard to write, it should be hard to read’).
· Software development tools - visual tools, class libraries, compilers, scripting tools, etc. often introduce their own bugs or are poorly documented, resulting in added bugs.
· A lot depends on the size of the organization and the risks involved. For large organizations with high-risk (in terms of lives or property) projects, serious management buy-in is required and a formalized QA process is necessary.
· Where the risk is lower, management and organizational buy-in and QA implementation may be a slower, step-at-a-time process. QA processes should be balanced with productivity so as to keep bureaucracy from getting out of hand.
· For small groups or projects, a more ad-hoc process may be appropriate, depending on the type of customers and projects. A lot will depend on team leads or managers, feedback to developers, and ensuring adequate communications among customers, managers, developers, and testers.
· In all cases the most value for effort will be in requirements management processes, with a goal of clear, complete, testable requirement specifications or expectations.
What is verification? validation?
Verification typically involves reviews and meetings to evaluate documents, plans, code, requirements, and specifications. This can be done with checklists, issues lists, walkthroughs, and inspection meetings. Validation typically involves actual testing and takes place after verifications are completed. The term ‘IV & V’ refers to Independent Verification and Validation.
What’s an ‘inspection’?
An inspection is more formalized than a ‘walkthrough’, typically with 3-8 people including a moderator, reader, and a recorder to take notes. The subject of the inspection is typically a document such as a requirements spec or a test plan, and the purpose is to find problems and see what’s missing, not to fix anything. Attendees should prepare for this type of meeting by reading thru the document; most problems will be found during this preparation. The result of the inspection meeting should be a written report. Thorough preparation for inspections is difficult, painstaking work, but is one of the most cost effective methods of ensuring quality. Employees who are most skilled at inspections are like the ‘eldest brother’ in the parable in ‘Why is it often hard for management to get serious about quality assurance?’. Their skill may have low visibility but they are extremely valuable to any software development organization, since bug prevention is far more cost-effective than bug detection.
· Black box testing - not based on any knowledge of internal design or code. Tests are based on requirements and functionality.
· White box testing - based on knowledge of the internal logic of an application’s code. Tests are based on coverage of code statements, branches, paths, conditions.
· Unit testing - the most ‘micro’ scale of testing; to test particular functions or code modules. Typically done by the programmer and not by testers, as it requires detailed knowledge of the internal program design and code. Not always easily done unless the application has a well-designed architecture with tight code; may require developing test driver modules or test harnesses.
· Incremental integration testing - continuous testing of an application as new functionality is added; requires that various aspects of an application’s functionality be independent enough to work separately before all parts of the program are completed, or that test drivers be developed as needed; done by programmers or by testers.
· Integration testing - testing of combined parts of an application to determine if they function together correctly. The ‘parts’ can be code modules, individual applications, client and server applications on a network, etc. This type of testing is especially relevant to client/server and distributed systems.
· Functional testing - black box type testing geared to functional requirements of an application; this type of testing should be done by testers. This doesn’t mean that the programmers shouldn’t check that their code works before releasing it (which of course applies to any stage of testing.)
· System testing - black box type testing that is based on overall requirements specifications; covers all combined parts of a system.
· End-to-end testing - similar to system testing; the ‘macro’ end of the test scale; involves testing of a complete application environment in a situation that mimics real-world use, such as interacting with a database, using network communications, or interacting with other hardware, applications, or systems if appropriate.
· Sanity testing - typically an initial testing effort to determine if a new software version is performing well enough to accept it for a major testing effort. For example, if the new software is crashing systems every 5 minutes, bogging down systems to a crawl, or destroying databases, the software may not be in a ’sane’ enough condition to warrant further testing in its current state.
· Regression testing - re-testing after fixes or modifications of the software or its environment. It can be difficult to determine how much re-testing is needed, especially near the end of the development cycle. Automated testing tools can be especially useful for this type of testing.
· Acceptance testing - final testing based on specifications of the end-user or customer, or based on use by end-users/customers over some limited period of time.
· Load testing - testing an application under heavy loads, such as testing of a web site under a range of loads to determine at what point the system’s response time degrades or fails.
· Stress testing - term often used interchangeably with ‘load’ and ‘performance’ testing. Also used to describe such tests as system functional testing while under unusually heavy loads, heavy repetition of certain actions or inputs, input of large numerical values, large complex queries to a database system, etc.
· Performance testing - term often used interchangeably with ’stress’ and ‘load’ testing. Ideally ‘performance’ testing (and any other ‘type’ of testing) is defined in requirements documentation or QA or Test Plans.
· Usability testing - testing for ‘user-friendliness’. Clearly this is subjective, and will depend on the targeted end-user or customer. User interviews, surveys, video recording of user sessions, and other techniques can be used. Programmers and testers are usually not appropriate as usability testers.
· Install/uninstall testing - testing of full, partial, or upgrade install/uninstall processes.
· Recovery testing - testing how well a system recovers from crashes, hardware failures, or other catastrophic problems.
· Security testing - testing how well the system protects against unauthorized internal or external access, willful damage, etc; may require sophisticated testing techniques.
· Compatibility testing - testing how well software performs in a particular hardware/software/operating system/network/etc. environment.
· Exploratory testing - often taken to mean a creative, informal software test that is not based on formal test plans or test cases; testers may be learning the software as they test it.
· Ad-hoc testing - similar to exploratory testing, but often taken to mean that the testers have significant understanding of the software before testing it.
· User acceptance testing - determining if software is satisfactory to an end-user or customer.
· Comparison testing - comparing software weaknesses and strengths to competing products.
· Alpha testing - testing of an application when development is nearing completion; minor design changes may still be made as a result of such testing. Typically done by end-users or others, not by programmers or testers.
· Beta testing - testing when development and testing are essentially completed and final bugs and problems need to be found before final release. Typically done by end-users or others, not by programmers or testers.
· Mutation testing - a method for determining if a set of test data or test cases is useful, by deliberately introducing various code changes (’bugs’) and retesting with the original test data/cases to determine if the ‘bugs’ are detected. Proper implementation requires large computational resources.
· Poor requirements - if requirements are unclear, incomplete, too general, or not testable, there will be problems.
· Unrealistic schedule - if too much work is crammed in too little time, problems are inevitable.
· Inadequate testing - no one will know whether or not the program is any good until the customer complains or systems crash.
· Futurities - requests to pile on new features after development is underway; extremely common.
· Miscommunication - if developers don’t know what’s needed or customer’s have erroneous expectations, problems are guaranteed.