Python and Lithuanian letters

Techtronic
Mindaugas N.
  • 26 Apr '12
UnicodeEncodeError: 'ascii' codec can't encode character u'\u017e' in position 22: ordinal not in range(128)

How do I get Python to write Lithuanian letters?
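
For context, this error appears when a unicode string has to be turned into bytes and the ASCII codec is used for it, which is what Python 2 does by default when no encoding is named. The sketch below is a minimal reproduction, not the poster's code: the sample text is borrowed from later in the thread, and the encoding step is spelled out explicitly so the snippet behaves the same on Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
# Sample text borrowed from later in this thread; \u017e is the letter "ž".
text = u"baland\u017eio 26"

try:
    # Python 2 performs this step implicitly when a unicode string is
    # written to a plain byte stream; here it is done explicitly.
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError as exc:
    ascii_failed = True
    print(exc)

# Naming an encoding that can represent "ž" avoids the error.
utf8_bytes = text.encode("utf-8")
print(repr(utf8_bytes))
```

The general remedy is to choose the encoding explicitly at the point where text becomes bytes, instead of relying on the default.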

A
  • 26 Apr '12

Is the UTF-8 encoding declared at the top of the file?

# -*- coding: utf-8 -*-
Techtronic
Mindaugas N.
  • 26 Apr '12

Yes

# coding=utf-8
A
  • 26 Apr '12

If you are trying to write Lithuanian letters to a file, a similar problem is described here.
If I guessed wrong, a small snippet of the code around the place where it fails would help.

Techtronic
Mindaugas N.
  • 26 Apr '12

That's tricky, because there is more than one file and it also uses JSON (fetched from a web API that requires a dev key and rate-limits requests).

But as an example:

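import json
import urllib2  # Python 2 standard library

wapi = urllib2.urlopen(apiurl)         # apiurl is defined elsewhere
json_string = wapi.read()              # raw response bytes
parsed_json = json.loads(json_string)  # string values come back as unicode objects
wapi.close()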

And later open("data","w") ...

Techtronic
Mindaugas N.
  • 26 Apr '12

json_string is (part of it):

"tempm":"4.0", "tempi":"39.2","dewptm":"4.0", "dewpti":"39.2","hum":"100","wspdm":"3.7", "wspdi":"2.3","wgustm":"-9999.0", "wgusti":"-9999.0","wdird":"140","wdire":"Pietryčių","vism":"2.2", "visi":"1.4","pressurem":"1011", "pressurei":"29.86","windchillm":"-999", "windchilli":"-999","heatindexm":"-9999", "heatindexi":"-9999","precipm":"-9999.00", "precipi":"-9999.00","conds":"Migla","icon":"hazy","fog":"0","rain":"0","snow":"0","hail":"0","thunder":"0","tornado":"0","metar":"METAR EYVI 260150Z 14002KT 2200 BR PRFG NSC 04/04 Q1011 BECMG 0350 FG" },
        {
        "date": {
        "pretty": "05:20 AM EEST on balandžio 26, 2012",
        "year": "2012",
        "mon": "04",
        "mday": "26",
        "hour": "05",
        "min": "20",
        "tzname": "Europe/Vilnius"
        },

I need to write a new table to the file:

foo = {
  pretty = "05:20 AM EEST on balandžio 26, 2012",
  bar = "baz"
}
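
One way to produce a table like this is to build the whole text as a unicode string and write it through a file object that was opened with an explicit encoding. This is only a sketch under assumptions: the temporary path stands in for the "data" file, the values are copied from the post, and quotes inside values are not escaped.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Key/value pairs copied from the post above, kept in order.
foo = [(u"pretty", u"05:20 AM EEST on baland\u017eio 26, 2012"),
       (u"bar", u"baz")]

# Build the table as one unicode string (no escaping of quotes in values).
body = u",\n".join(u'  %s = "%s"' % (key, value) for key, value in foo)
table = u"foo = {\n" + body + u"\n}\n"

# Hypothetical output location; the thread writes to a file named "data".
path = os.path.join(tempfile.mkdtemp(), "data")

# io.open encodes on write, so no implicit ASCII conversion takes place.
with io.open(path, "w", encoding="utf-8") as out:
    out.write(table)

with io.open(path, "r", encoding="utf-8") as src:
    saved = src.read()
```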
A
  • 26 Apr '12

@Techtronic wrote:
That's tricky, because there is more than one file and it also uses JSON

Hard to say; evidently the encoding fails somewhere and falls back to ASCII. This article recommends using the 'codecs' module. Besides a wrong encoding, there can also be subtleties with the BOM (better not to use one).
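
To illustrate both points, here is a small sketch that writes the same text through the codecs module twice, once as "utf-8" and once as "utf-8-sig", and compares the raw bytes. The file names and the sample text are made up for the illustration; the only difference between the two results should be the three-byte BOM at the start.

```python
# -*- coding: utf-8 -*-
import codecs
import os
import tempfile

text = u"baland\u017eio"  # sample text with a Lithuanian letter
tmpdir = tempfile.mkdtemp()


def write_and_read_raw(encoding):
    """Write text with the given encoding and return the bytes on disk."""
    path = os.path.join(tmpdir, "data-" + encoding)
    with open(path, "wb") as raw:
        # Wrap the binary file so unicode text is encoded on write.
        writer = codecs.getwriter(encoding)(raw)
        writer.write(text)
    with open(path, "rb") as raw:
        return raw.read()


plain = write_and_read_raw("utf-8")
with_bom = write_and_read_raw("utf-8-sig")

# "utf-8-sig" prepends the BOM; plain "utf-8" does not.
print(repr(plain))
print(repr(with_bom))
```

Programs that do not expect a BOM may treat those extra bytes as part of the data, which is one reason to prefer plain "utf-8".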

W
  • 26 Apr '12

http://docs.python.org/library/json.html

If the contents of fp are encoded with an ASCII based encoding other than UTF-8 (e.g. latin-1), then an appropriate encoding name must be specified. Encodings that are not ASCII based (such as UCS-2) are not allowed, and should be wrapped with codecs.getreader(encoding)(fp), or simply decoded to a unicode object and passed to loads().

In other words, find out how your string is encoded and then:

# loads, not load: json_string is a string, not a file object
json.loads(json_string, encoding='iso-8859-13')
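
The last option in the quoted documentation, decoding to a unicode object first and passing that to loads(), can be sketched as follows. The byte strings here are stand-ins for what wapi.read() might return, and which encoding the real API uses is not stated in the thread, so both candidates are shown.

```python
# -*- coding: utf-8 -*-
import json

# Stand-in for the API response: the same JSON text encoded two ways.
document = u'{"pretty": "05:20 AM EEST on baland\u017eio 26, 2012"}'
as_utf8 = document.encode("utf-8")
as_baltic = document.encode("iso-8859-13")

# Decode with the matching encoding first, then parse the unicode text.
parsed_utf8 = json.loads(as_utf8.decode("utf-8"))
parsed_baltic = json.loads(as_baltic.decode("iso-8859-13"))

# A wrong guess does not necessarily raise; it can silently garble text.
wrong = json.loads(as_utf8.decode("iso-8859-13"))

print(parsed_utf8 == parsed_baltic)
print(wrong["pretty"] == parsed_utf8["pretty"])
```

Decoding explicitly also works on newer Python versions, where loads() no longer accepts an encoding argument.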
Techtronic
Mindaugas N.
  • 26 Apr '12

I did it differently: I pick what I need out of the JSON and put it into another dictionary:

'observation_time': u'Last Updated on baland\u017eio 26, 3:20 PM EEST',

Now it saves correctly.
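
The thread does not show how the selection or the saving was done, so the following is only a guess at the shape of it. The dictionary stands in for parsed_json using values quoted in the thread, and the list of wanted keys is hypothetical. It also shows that \u017e in the output above is just an escaped spelling of "ž", not damaged text.

```python
# -*- coding: utf-8 -*-
import json

# Stand-in for parsed_json, using values quoted earlier in the thread.
parsed_json = {
    u"observation_time": u"Last Updated on baland\u017eio 26, 3:20 PM EEST",
    u"icon": u"hazy",
    u"conds": u"Migla",
}

# Hypothetical choice of fields to keep.
wanted = [u"observation_time", u"conds"]
selected = dict((key, parsed_json[key]) for key in wanted)

# By default json.dumps escapes non-ASCII characters as \uXXXX.
escaped = json.dumps(selected, sort_keys=True)
# With ensure_ascii=False the letters are kept as they are.
readable = json.dumps(selected, sort_keys=True, ensure_ascii=False)

print(escaped)
# Both spellings parse back to the same dictionary.
print(json.loads(escaped) == json.loads(readable))
```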