[BioNLP] Citation information

D.J.King D.J.King at open.ac.uk
Tue Oct 20 15:45:53 EDT 2009

As you say, cURL and Wget can be tricky to use. I don't know about Python (it's on my to-do list to learn the language), but in PHP both the fopen() and file_get_contents() functions successfully retrieve a Google Scholar web page. The PHP equivalent of typing piwowar and nlp into the search box on the Google Scholar page is:

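$buffer = file_get_contents('http://scholar.google.com/scholar?q=piwowar+nlp'); // read response from Google Scholar
if ($buffer === false) die('Request to Google Scholar failed');                 // stop if nothing came back
$fout = fopen('piwowar_results.html', 'w');                                     // open output file
fwrite($fout, $buffer);                                                         // write response to output file
fclose($fout);                                                                  // close output file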
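
If the search terms vary, the query string can be built rather than typed by hand. The following is only a minimal sketch: the helper name scholar_url() is my own invention for illustration, and it assumes the search page accepts nothing more than the q parameter used above.

```php
<?php
// Hypothetical helper (not part of any library): build the search URL
// for a list of terms, using the same address as the snippet above.
function scholar_url(array $terms): string
{
    // http_build_query() URL-encodes the value; with its default
    // encoding, the spaces between terms become '+'.
    $query = http_build_query(['q' => implode(' ', $terms)]);
    return 'http://scholar.google.com/scholar?' . $query;
}

echo scholar_url(['piwowar', 'nlp']), "\n";
// prints http://scholar.google.com/scholar?q=piwowar+nlp
```

The returned string can be passed straight to file_get_contents() in place of the hard-coded URL.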

The file piwowar_results.html now contains the web page, which can be parsed and the links for citations etc. passed to another file_get_contents() call and so on. The code is not very elegant, I admit, but it's a technique I've resorted to quite a few times when having to hack some code in a hurry.
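
As a rough sketch of the parsing step, assuming PHP's DOM extension is available, something like the following could pull the links out of the saved page. The function name extract_links(), the sample markup and the way relative links are resolved are all illustrative assumptions; nothing here reflects the actual structure of a Google Scholar results page.

```php
<?php
// Hypothetical helper (not from any library): return the href of every
// <a> element in an HTML string. Root-relative links such as "/scholar?..."
// are prefixed with $base; other relative forms are left untouched.
function extract_links(string $html, string $base): array
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML, so suppress parser warnings.
    @$doc->loadHTML($html);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        if ($href === '') {
            continue;                          // skip anchors without a target
        }
        if ($href[0] === '/' && substr($href, 0, 2) !== '//') {
            $href = $base . $href;             // make root-relative links absolute
        }
        $links[] = $href;
    }
    return $links;
}

// Invented stand-in for the contents of piwowar_results.html.
$sample = '<html><body>'
        . '<a href="/scholar?cites=123">Cited by 4</a>'
        . '<a href="http://example.org/paper.pdf">PDF</a>'
        . '<a name="top">no href</a>'
        . '</body></html>';

foreach (extract_links($sample, 'http://scholar.google.com') as $url) {
    echo $url, "\n";   // each URL could be handed to file_get_contents() in turn
}
```

In real use, $sample would be replaced by file_get_contents('piwowar_results.html'), and the links would need filtering to keep only the citation ones.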


