Help:Research using the web

From SourceWatch
Research using the web is a research guide. See the other research guides

There are some tricks to saving time when using the web for research.

Bypassing newspaper registration requests

  • Login names for more than 20,000 websites are available for those who believe compulsory registration is a violation of privacy, or at least a waste of time. There is a bookmarklet available at the site. Here is a link to the Firefox extension to bypass registration automatically from the browser side.
  • Sogole Honarvar outlined in Poynter Online the secret to Avoiding registration at news sites. "Poynter Online reports a tip for those tired of registering at news Web sites in order to access free content. Using the log-in 'freethepresses', or '' if an email address is appropriate, and the password 'freethepresses' will provide users with quick access to news without the hassle of registration. This trick works at the Los Angeles Times site, The New York Times site and the sites for both The Chicago Tribune and the Washington Post. Although it is unclear who went through the trouble to create these fake log-ins, news junkies everywhere are surely thankful for the tip."
  • To generate permanent links to New York Times articles, use the NYT-supplied link generator here:

Assessing the quality of Information on the Internet

Web Searching

To pin down exactly what you are searching for, put the full name or phrase in quotation marks. Using George W. Bush as the example, type "George W. Bush" in the search line. The quotation marks keep George, W. and Bush together. Without them, the search treats each part of the name separately: you will get some results for George W. Bush as a name, but will quickly find results in which George and Bush are scattered throughout the page.

  • To narrow your search, say to find George W. Bush's military service record or simply information about his military service, type in "George W. Bush", "military service". Using quotation marks (" ") to bracket each phrase is key.
  • This technique can be extended with variations such as "George Bush", "military service" or "George Bush", "military records". The quotation marks hold each phrase or name together, and variations on this type of search will bring up more and more records. The search may also return results for "George H.W. Bush". To exclude them, try typing in "George Bush", "military records" -"George H.W. Bush".
  • Whole quotes can be found using this technique as well, which is very useful for tracing a quote found out of context or for identifying who said it and where. Always enclose the quote in quotation marks.
  • "Clustering" can also be helpful in this regard, such as that provided by Clusty. This allows you to more easily navigate broad, general queries by breaking results down into automatic categories. For example, a query on "George Bush" may yield a cluster labeled "military records."
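The quoting and exclusion rules above can be sketched as a small helper. This is only an illustration: the function name build_query is hypothetical, and it assumes the search engine accepts space-separated quoted phrases and a leading minus sign for exclusions, as in the examples above.

```python
def build_query(phrases, exclude=()):
    """Return a search string with every phrase quoted.

    Phrases in `exclude` are quoted and prefixed with '-' so that the
    search engine drops results containing them.
    """
    parts = ['"{}"'.format(p) for p in phrases]
    parts += ['-"{}"'.format(p) for p in exclude]
    return " ".join(parts)


# Hypothetical usage, mirroring the George Bush example above.
query = build_query(["George Bush", "military records"],
                    exclude=["George H.W. Bush"])
print(query)
```

The resulting string can be pasted into the search line as-is.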

See this article on "Ten Tips for Smarter Google Searches" for more search tips.

Searching and referencing books via Google Print

The Google Print service opens up an expanding range of printed books to use in SourceWatch research. This service, launched in October 2004, indexes the contents of books from publishers signed up to the service. To search it, search Google for books on topicname. For example, to search for mentions of Michael Ledeen in printed books, use the following Google search: books on "Michael Ledeen"

Linking to a scanned page in Google Print from SourceWatch is slightly fiddly. It is necessary to remove the prev= parameter from the URL; otherwise SourceWatch will not format the link correctly. Removing this parameter does not prevent the link from working. For example, this is the link to the page from Reagan Presidency by Deborah Hart Strober that mentions Michael Ledeen: Here it is with the "prev=" parameter removed:

This latter form enables the link to be formatted thus: A page from the book Reagan Presidency
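If you need to do this often, the prev= parameter can be stripped with a few lines of Python. This is a minimal sketch: the function name strip_param and the example URL are made up for illustration and are not real Google Print addresses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_param(url, name="prev"):
    """Return `url` with every query parameter called `name` removed."""
    parts = urlsplit(url)
    # Keep all query parameters except the one being stripped.
    kept = [(k, v)
            for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k != name]
    # Note: urlencode re-encodes the remaining values, so unusual
    # characters may come back percent-encoded.
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), parts.fragment))


# Hypothetical URL, for illustration only.
example = ("http://books.example.org/print?id=ABC123&pg=42"
           "&prev=http://search.example.org/q&sig=xyz")
print(strip_param(example))
```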

The system limits you to looking at only a few pages either side of the page returned by the search. Furthermore, the clipboard has been disabled so you cannot directly copy and paste text from the scanned pages of the book. This simply means you will have to retype the information - sorry!

Internet archives

  • To locate a seemingly "dead" URL, go to the Wayback Machine, type in the URL and search. Sometimes phrases or headlines will work to locate an article. However, there are limits to what can be retrieved. Website publishers can opt out of the Wayback Machine's indexing system. If information on a site is important for referencing a key point, you are best off saving a copy to your computer.
  • The Internet Archive provides a way of finding web pages that have changed or disappeared.
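A lookup address for a dead URL can also be built directly. This sketch assumes the Internet Archive's usual address pattern of web.archive.org/web/ followed by a timestamp (or * for a list of all captures) and then the original URL. The function name wayback_lookup_url is hypothetical, and the pattern may change over time.

```python
WAYBACK_PREFIX = "https://web.archive.org/web/"


def wayback_lookup_url(dead_url, timestamp="*"):
    """Build a Wayback Machine address for `dead_url`.

    With the default "*" the address is meant to list all captures;
    a timestamp such as "2005" asks for a capture near that date.
    """
    return WAYBACK_PREFIX + timestamp + "/" + dead_url


# Hypothetical dead link, for illustration only.
print(wayback_lookup_url("http://www.example.com/page.html"))
```

Whether any capture exists still depends on the limits described above, such as publishers opting out.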

Environmental news

  • An archive of Reuters environmental news stories is available at the website of the Australian environmental group Planet Ark. Access to the searchable archive is free and does not require a password or subscription. You can also sign up for a daily e-mail summary of Reuters' latest environmental stories.

Government research reports

Researching companies

Increasingly, company information is available from a variety of online sites, including:

  • corporate regulators: Many corporate regulators require companies to file electronic copies of documents for public access, such as annual reports, prospectuses (issued to the market to support a float or capital raising, and required to sketch trends that may affect business forecasts), changes of directors and major changes to shareholdings.
  • company sites: Most major company websites list documents filed with corporate regulators, though the extent of the archive varies. They also often include media releases and corporate social and environmental reports. (It is worth noting that media releases on controversial topics are commonly posted only for a short period and then removed to avoid drawing attention to a controversy.)
  • LexisNexis provides a subscription-only service to its extensive online databases. Nexis is a powerful research tool if you want to search newspaper articles around the world (or more narrowly) on a particular topic. It is a premium service priced for business clients and unaffordable for most individuals. However, research sections at good public libraries often have subscriptions and will search topics for you. It is also common for journalism schools at universities to provide access for enrolled students. Costs aside, Nexis has its limits. Many publications only started providing electronic archives for the Nexis pool in the mid-1990s, and, due to copyright restrictions, many articles by freelancers have also been excluded unless the freelancers signed their rights over to the original publisher.
  • Many newspapers use the Accurint website for research. Accurint provides access to public information from hundreds of sources. It charges a fee for some searches, but the costs are fairly low.
  • Attorneys and private investigators often use PublicData, which requires a minimum yearly subscription of $25.
  • Job Tracker, a site affiliated with the AFL-CIO, with links to data on companies "exporting jobs, endangering workers' health or involved in cases of violations of workers' rights under the National Labor Relations Act."

Researching Wikipedia

You can research who has been making changes to Wikipedia by using the Wikiscanner tool, which searches for Wikipedia edits by IP address. Edits made to Wikipedia by logged-in users are recorded under their username, while edits made while not logged in are recorded under the IP address of the computer making the edit. Because the IP address ranges of many companies, organizations and government agencies are known, the Wikiscanner tool can be used to identify edits made from their computers by users who are not logged in.
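The underlying idea of matching an edit's IP address against known ranges can be illustrated in a few lines. This is not Wikiscanner's actual code. The organization names and address ranges below are hypothetical placeholders taken from blocks reserved for documentation, and owners_of is a made-up helper name.

```python
import ipaddress

# Hypothetical organizations and ranges, for illustration only.
KNOWN_RANGES = {
    "Example Corp (hypothetical)": ["192.0.2.0/24"],
    "Example Agency (hypothetical)": ["198.51.100.0/24", "203.0.113.0/25"],
}


def owners_of(ip):
    """Return the names of all organizations whose ranges contain `ip`."""
    addr = ipaddress.ip_address(ip)
    return [name
            for name, cidrs in KNOWN_RANGES.items()
            if any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)]


# An anonymous edit recorded under this address would be attributed
# to the first placeholder organization.
print(owners_of("192.0.2.17"))
```

A match only says which organization's network the edit came from, not which person made it.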

Many edits that violate Wikipedia's rules on conflict of interest edits, defined as "disregard[ing] the aims of Wikipedia to advance outside interests"[1], are made while not logged in, perhaps in an effort to disguise the identity of an editor engaged in "whitewashing", "greenwashing" or other PR-minded changes to an article. The Wikiscanner makes it possible to determine the location of the computer used to make those edits. However, the tool generally cannot determine the identity of the specific person making the change. The fact that an edit was made on an organization's computer does not mean that it was officially sanctioned, nor is an edit made while not logged in necessarily malicious.

Wikiscanner's "Editors picks" include edits made by computers at the U.S. Senate (wikiscanner log), U.S. House of Representatives (wikiscanner log), Environmental Protection Agency (wikiscanner log), National Institute of Health (wikiscanner log), Democratic Party (wikiscanner log), Republican Party (wikiscanner log), NATO (wikiscanner log), Electronic Frontier Foundation (wikiscanner log), Rand Corporation (wikiscanner log), National Rifle Association (wikiscanner log), American Civil Liberties Union (wikiscanner log), Diebold Inc. (wikiscanner log), Amgen Inc. (wikiscanner log), Pfizer Inc. (wikiscanner log), Wal-Mart Stores Inc. (wikiscanner log), ExxonMobil (wikiscanner log) and Raytheon (wikiscanner log).

More external resources:

Economics and finance


Other SourceWatch resources