Python Pubmed
Scraping PubMed with a Python Script
#!/usr/bin/env python
##################
# PYTHON SCRIPT
# PERFORM WEBSITE SCRAPE OF PUBMED
# PULL RELEVANT ARTICLE INFO FROM WEBPAGE
# FORMAT CONTENT FOR WIKI TEMPLATE
# REQUIRES 'BeautifulSoup'
# AUTHOR: BRADLEY MONK
# LICENSE: GNU
#################
import re  # re.compile('<title>(.*)</title>')
try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2
from bs4 import BeautifulSoup

# Fetch the article page and parse it with Python's built-in HTML parser
html = urlopen('http://www.ncbi.nlm.nih.gov/pubmed/10731148').read()
soup = BeautifulSoup(html, 'html.parser')

print("#################------------------#################")

#------- pubmed authors ---------#
print("{{Article|")
auth_divs = soup.find_all('div', attrs={"class": "auths"})
for div in auth_divs:
    # Print every text fragment inside the authors block
    for string in div.strings:
        auts = string
        print(string)
#------- pubmed authors ---------#
# auts holds the last fragment from the loop; it is printed once more, as in the original
print(auts)

#------- pubmed year ------------#
print("|")
jouryear = soup.find_all(attrs={"class": "cit"})
year = jouryear[0].get_text()
# The year is the four characters that start two positions after the first period
titleend = year.find(".")
year1 = titleend + 2
print(year[year1:year1 + 4])
#------- pubmed year ------------#

#------- pubmed journal ---------#
journal = soup.find_all(attrs={"class": "cit"})
print("|")
print(journal[0].a.string)
#------- pubmed journal ---------#
print("- [http://domain.com/linktofile.pdf PDF]")

#--------- pubmed PMID -----------#
PMID = soup.find_all(attrs={"class": "rprtid"})
print("|")
print(PMID[0].dd.string)
#--------- pubmed PMID -----------#

#------- pubmed title ---------#
title = soup.find_all(attrs={"class": "rprt abstract"})
print("|")
print(title[0].h1.string)
#------- pubmed title ---------#
print("}}")

print("{{ExpandBox|Expand to view experiment summary|")
#------- pubmed abstract ---------#
abstract = soup.find_all(attrs={"class": "abstr"})
print(abstract[0].p.string)
#------- pubmed abstract ---------#
print("}}")
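The script above finds the year by counting characters from the first period in the citation text, which only works when the year starts exactly two characters after that period. A minimal alternative sketch, assuming the citation text looks like "Journal. YYYY ..." (an assumption inferred from that slicing), uses the standard `re` module that the script already imports. The helper name `extract_year` and the sample citation are made up for illustration and are not part of the original script.

```python
import re

# Matches a standalone four-digit year beginning with 19 or 20.
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")


def extract_year(citation):
    """Return the first four-digit year found in the citation text, or None."""
    match = YEAR_PATTERN.search(citation)
    return match.group(0) if match else None


# Made-up citation string, used only to show the call.
sample = "Some Journal. 2000 Jan 1;1(1):1-10."
print(extract_year(sample))
```

In the script, this could be called on the text returned by `jouryear[0].get_text()`. Returning `None` when no year is found lets the caller decide what to print instead of slicing out unrelated characters.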
Result
Hayashi Y Shi SH Esteban JA Piccini A Poncer JC Malinow R • 2000 • Science PDF
Expand to view experiment summary
To elucidate mechanisms that control and execute activity-dependent synaptic plasticity, alpha-amino-3-hydroxy-5-methyl-4-isoxazole propionate receptors (AMPA-Rs) with an electrophysiological tag were expressed in rat hippocampal neurons. Long-term potentiation (LTP) or increased activity of the calcium/calmodulin-dependent protein kinase II (CaMKII) induced delivery of tagged AMPA-Rs into synapses. This effect was not diminished by mutating the CaMKII phosphorylation site on the GluR1 AMPA-R subunit, but was blocked by mutating a predicted PDZ domain interaction site. These results show that LTP and CaMKII activity drive AMPA-Rs to synapses by a mechanism that requires the association between GluR1 and a PDZ domain protein.
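The script prints each field with a separate `print` call, so the wikitext arrives spread over many lines. A minimal sketch, assuming the same field order as the script (authors, year, journal with PDF link, PMID, title, then the abstract in an ExpandBox), builds the whole block as one string once the fields have been scraped. The function `format_article` and its parameter names are hypothetical, and the title and abstract passed in the usage lines are placeholders rather than the real article text.

```python
def format_article(authors, year, journal, pmid, title, abstract,
                   pdf_url="http://domain.com/linktofile.pdf"):
    """Build the {{Article}} and {{ExpandBox}} wikitext from scraped fields."""
    fields = [
        authors,
        year,
        journal + " - [" + pdf_url + " PDF]",
        pmid,
        title,
    ]
    article = "{{Article|" + "|".join(fields) + "}}"
    summary = "{{ExpandBox|Expand to view experiment summary|" + abstract + "}}"
    return article + "\n" + summary


# Usage with values shown on this page; title and abstract are placeholders.
wikitext = format_article(
    authors="Hayashi Y Shi SH Esteban JA Piccini A Poncer JC Malinow R",
    year="2000",
    journal="Science",
    pmid="10731148",
    title="TITLE PLACEHOLDER",
    abstract="ABSTRACT PLACEHOLDER",
)
print(wikitext)
```

Keeping the formatting in one function separates it from the scraping, so the template layout can be changed without touching the page-parsing code.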