This data contamination could consist of inserting new information (such as fabricated entries in a web log) or overwriting existing data.
This proliferation of devices and software implies a concomitant expansion of analysis tools. Where these tools exist, they invariably have some cost associated with them. However, the proliferation of devices and systems also means that there are often no such tools available, at least not yet. We also need to know whether the vendor of any such tool is accredited.
Another byproduct of technological advances is the massive increase in storage capacity: there is far more disc space to inspect, and imaging it takes correspondingly longer.
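To put rough numbers on the imaging problem, here is a back-of-the-envelope sketch in Python; the 120 MB/s sustained read rate is an assumed figure, not a measurement, and real rates vary with interface, drive and tool:

    # Rough imaging-time estimate; the throughput figure is an assumption,
    # not a measured value.
    def imaging_hours(capacity_gb: float, throughput_mb_s: float = 120.0) -> float:
        """Hours needed to read every sector of a drive at a sustained rate."""
        return capacity_gb * 1024 / throughput_mb_s / 3600

    for capacity in (250, 1000, 4000):  # GB
        print(f"{capacity:>5} GB drive: ~{imaging_hours(capacity):.1f} h to image")

Even at an optimistic sustained rate, a 4 TB drive ties up an imaging workstation for the better part of a working day.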
The availability of cheap storage also encourages people to make more backups across new devices and media.
All of these have the effect of significantly increasing the workload of the forensic analyst, as we saw in Rob's guest lecture a few weeks back.
This "off-site" storage is manifested in Web hosting services, shared document services (such as GoogleDocs), peer-to-peer, and even in mail services (gmail and hotmail). There have been some privacy concerns with these services, sometimes due to software errors. However the agencies hosting these services are able to collect a significant quantity of personal information.
These sharing services allow users to do remotely what would previously have been done with a file kept on their local hard disc. Getting a copy of such a file for analysis might then depend on finding a cached copy.
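As an illustration of that last resort, here is a minimal sketch of hunting for a cached copy by scanning a seized cache directory for a known URL fragment. The path and the fragment are hypothetical, and real browser caches (e.g. Chromium's block files) store objects in binary formats that need format-aware tools to extract properly:

    from pathlib import Path

    def find_cached(cache_dir: str, needle: bytes):
        """Yield cache files whose raw bytes contain the given URL fragment.

        Naive byte search only; extracting the stored object itself
        requires a tool that understands the cache format.
        """
        for path in Path(cache_dir).rglob("*"):
            if path.is_file():
                try:
                    if needle in path.read_bytes():
                        yield path
                except OSError:
                    continue  # unreadable entry; skip it

    # Hypothetical evidence path and URL fragment, for illustration only:
    for hit in find_cached("/evidence/browser_cache", b"docs.google.com"):
        print(hit)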
A lot of evidence is also contained in records held by third parties. This includes web logs, query logs, site accesses, and banking or commercial records. While law enforcement agencies can sometimes successfully access such data through a court order, this option is not open to industrial forensic analysts.
Even when data is accessible, it can comprise an enormous quantity of information. For example, web logs can yield information on a target's interests, complete with dates and downloaded data. However, an analyst ends up having to sift through large numbers of entries, most of which are routine and uninformative.
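To make the sifting concrete, here is a minimal sketch that pulls out just the entries attributable to one client from a web log in Common Log Format; the log file name and the client address are placeholders:

    import re
    from collections import Counter

    # Common Log Format: host ident authuser [date] "request" status bytes
    LOG_LINE = re.compile(
        r'(\S+) \S+ \S+ \[([^\]]+)\] "([A-Z]+) (\S+) [^"]*" (\d{3}) (\S+)'
    )

    def requests_by_client(log_path: str, client_ip: str) -> Counter:
        """Count the URLs one client requested -- a crude interests profile."""
        urls = Counter()
        with open(log_path, encoding="utf-8", errors="replace") as log:
            for line in log:
                m = LOG_LINE.match(line)
                if m and m.group(1) == client_ip:
                    urls[m.group(4)] += 1
        return urls

    # Placeholder path and address, for illustration only:
    for url, hits in requests_by_client("access.log", "203.0.113.7").most_common(10):
        print(f"{hits:>4}  {url}")

Even this simple filter shows the shape of the problem: the interesting entries are a thin residue left after discarding everything else.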
A similar issue arises for intelligence agencies, which collect very large quantities of data (perhaps the entire traffic passing through US servers?). It is not feasible for all intercepted material to be inspected by a human agent, as was done, for example, at Bletchley Park during WWII. Instead, intercepted data is purportedly text-analysed with keyword searches.
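A toy version of that keyword triage might look like the following; the watch-list terms are invented for illustration:

    WATCHLIST = {"transfer", "password", "invoice"}  # invented terms

    def flag_lines(lines):
        """Yield (line_no, line) pairs containing any watch-listed keyword.

        Crude substring triage: everything else is dropped unread, which
        is the point -- a human only ever sees the flagged residue.
        """
        for no, line in enumerate(lines, start=1):
            lowered = line.lower()
            if any(term in lowered for term in WATCHLIST):
                yield no, line.rstrip()

    sample = ["Meeting at noon.", "Send the invoice today.", "Weather is fine."]
    for no, line in flag_lines(sample):
        print(f"line {no}: {line}")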
However, some of the problems cannot easily be addressed with technical solutions. Distributed storage is difficult to deal with, since ISPs are not always willing to hand over their logs for analysis (Google, for example, has refused to hand over its web logs to law enforcement agencies).
In our next section, our guest speaker from SAPOL will discuss some of these issues, in particular how to deal with "cloud computing".