Monday, 30 September 2013

mechanize: first form works, then "unknown GET form encoding type 'utf-8'"

I am trying to fill out two forms on the EUR-Lex website in order to
record some data from the generated page. I am stuck at the second form.
I feel this should be easy, and I have researched it a bit, but no luck.
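import mechanize

froot = '...'
f = open(froot + 'text.html', 'wb')  # binary mode, so the response body is written unchanged
br = mechanize.Browser()
br.open('http://eur-lex.europa.eu/RECH_legislation.do')

# First form: choose the search options
br.select_form(name='form2')
br['T1'] = ['V112']
br['T3'] = ['V2']
br['T2'] = ['V1']
first_page = br.submit()

# Save the source of the resulting page for inspection
f.write(first_page.get_data())
f.close()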
Up to this point everything seems to work, because the source of the
correct page is saved to the file. But then...
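# Second form (on the page returned by the first submit): set the date range
br.select_form(name='form2')
br['typedate'] = ['PD']
br['startaaaa'] = '1960'
br['startmm'] = '01'
br['startjj'] = '01'
br['endaaaa'] = '1960'
br['endmm'] = '12'
br['endjj'] = '31'  # was 'startjj', which overwrote the start day set above
second_page = br.submit()  # renamed from 'next' to avoid shadowing the builtin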
Here everything stops:
ValueError: unknown GET form encoding type 'utf-8'
I checked br.enctype after selecting each of the two forms. What I get is:
with the first form selected: application/x-www-form-urlencoded
with the second form selected: utf-8
I don't know what is going on here.
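
One possible reading of the values above is that the second page declares "utf-8" as the form's enctype in its HTML, and that mechanize refuses to submit a form with an encoding type it does not recognise. If that is the case, a workaround might be to reset the value before submitting. Below is a minimal sketch of that idea. The helper name ensure_submittable_enctype and the FakeForm stand-in are hypothetical and not part of mechanize; the only assumption about mechanize is that the selected form (br.form) has a writable enctype attribute.

```python
# Hypothetical helper, not part of mechanize itself.
# The two encoding types below are the ones assumed to be submittable.
VALID_ENCTYPES = (
    "application/x-www-form-urlencoded",
    "multipart/form-data",
)


def ensure_submittable_enctype(form, default="application/x-www-form-urlencoded"):
    """Reset form.enctype to `default` if it is not a recognised encoding type.

    Returns the value the page originally declared, so it can be logged.
    """
    declared = form.enctype
    if declared not in VALID_ENCTYPES:
        form.enctype = default
    return declared


class FakeForm(object):
    """Stand-in for a selected form, used only to exercise the helper offline."""

    def __init__(self, enctype):
        self.enctype = enctype


form = FakeForm("utf-8")
declared = ensure_submittable_enctype(form)
print("%s -> %s" % (declared, form.enctype))
```

In the script above, this would presumably be called as ensure_submittable_enctype(br.form) after the second br.select_form(name='form2') and before br.submit(). Whether the server then accepts the request is untested here.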
