Kara-Moon Forum
Author Topic: Input encoding  (Read 2845 times)
sciurius
Sr. Member
****
Posts: 443



« on: December 27, 2018, 12:32:56 PM »

A footnote in the documentation reads:

Quote
MMA is pretty open about the “encoding” of the file, but to keep Python 3.x happy you should use “cp1252” (a standard
Windows format).

Can you elaborate on this apparent restriction?
bvdp
Kara-Moon Master
****
Posts: 1437


WWW
« Reply #1 on: December 27, 2018, 04:02:53 PM »

It's pretty much just a matter of what the various character-conversion routines are happy with. If you prepare your input files in what we used to call ASCII or Latin-1 (8-bit Latin), you'll be fine. If you want more details than my little brain can provide, a starting point is: https://en.wikipedia.org/wiki/Windows-1252

MMA will get upset and probably crash and burn and delete all the data on the servers in Washington, DC if it encounters non-ASCII data in its input ... but, for the most part, it's nothing to worry about :)
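
For readers wondering what "upset" looks like in practice, here is a minimal Python 3 sketch of what happens when UTF-8 bytes are decoded as cp1252. It is not MMA code; the helper name decodes_as_cp1252 is made up for illustration. Most UTF-8 text decodes without error but comes out garbled, and a few byte values that Python's cp1252 codec leaves undefined raise an exception.

```python
# "caf\u00e9" is "cafe" with an acute accent on the e.
utf8_bytes = "caf\u00e9".encode("utf-8")   # b'caf\xc3\xa9'

# Silent mojibake: each UTF-8 byte is read as a separate cp1252 character,
# so the two-byte e-acute becomes A-tilde followed by a copyright sign.
mojibake = utf8_bytes.decode("cp1252")


def decodes_as_cp1252(data: bytes) -> bool:
    """Hypothetical helper: True if the bytes decode as cp1252 without error.

    Python's cp1252 codec leaves 0x81, 0x8D, 0x8F, 0x90 and 0x9D undefined,
    so input containing any of those bytes raises UnicodeDecodeError.
    """
    try:
        data.decode("cp1252")
        return True
    except UnicodeDecodeError:
        return False


# U+0141 (L with stroke) is b'\xc5\x81' in UTF-8; 0x81 is undefined in cp1252.
polish_l = "\u0141".encode("utf-8")
```

Plain ASCII input is unaffected either way, which is why most files never hit the problem.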

My online life: http://www.mellowood.ca
sciurius
Sr. Member
****
Posts: 443



« Reply #2 on: December 28, 2018, 02:35:13 PM »

So far I've been unable to delete all the data on the servers in Washington 8)

There are three places where the cp1252 encoding is enforced:

  • When opening the .mma source
    This can be dealt with by opening the file in raw (binary) mode and trying to decode it as UTF-8 first; if that fails, fall back to cp1252.
  • When decoding strings read from MIDI
  • When encoding strings written to MIDI
    Unfortunately there is no officially defined way to set encodings in a MIDI file, but there are some ways to deal with this. Think of the popularity of karaoke in Japan.
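
The UTF-8-first idea for the source file could look roughly like this. It is only a sketch, not a patch against MMA: the function names decode_source and read_source are invented here, and using errors="replace" in the fallback is my own assumption so that the handful of bytes undefined in cp1252 don't abort the read.

```python
def decode_source(raw: bytes) -> str:
    """Decode raw file contents: try UTF-8 first, then fall back to cp1252.

    Valid UTF-8 (including plain ASCII) is returned as-is. Anything else is
    treated as cp1252; undefined cp1252 bytes become U+FFFD (an assumption,
    chosen so the fallback itself never raises).
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("cp1252", errors="replace")


def read_source(path: str) -> str:
    """Open a source file in binary mode and decode it with decode_source."""
    with open(path, "rb") as handle:
        return decode_source(handle.read())
```

This works because non-ASCII cp1252 text is almost never valid UTF-8, so the first attempt fails quickly and existing cp1252 files keep working unchanged.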

I'll try to work out some enhancements. Now if only we had a git repo :D
bvdp
Kara-Moon Master
****
Posts: 1437


WWW
« Reply #3 on: December 28, 2018, 05:31:22 PM »

Until I converted MMA to work in both Python 2 and 3 there was no encoding handling at all. It's really just a "problem" with Python 3.x :)

However, I don't see it really being that much of a problem.

 - When opening source files in Python 3, one really does need to guess the encoding of the file. I don't think that restricting input to a Latin-1 type of character set is a big deal. If non-English characters are needed, they can be inserted as multi-byte sequences.

 - I really don't have access to any non-Latin-1 data. But it might be a thought to have an environment variable "MMA_ENCODING" and to use its value as the encoding in the 3 locations where one is needed. At least I'd be off the hook if there are any problems :) Easy enough to do at this end: just look for the variable, save it in globals, and insert it when needed. I think I picked cp1252 as a "reasonable value to use".
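
A rough sketch of the MMA_ENCODING idea, for discussion only. Apart from the variable name MMA_ENCODING and the cp1252 default, everything here is hypothetical: the function names, validating the codec name up front, and the errors="replace" policy are assumptions, not how MMA does it.

```python
import codecs
import os

DEFAULT_ENCODING = "cp1252"


def get_encoding(environ=os.environ) -> str:
    """Return the encoding named by MMA_ENCODING, or cp1252 if it is unset.

    codecs.lookup() raises LookupError for an unknown codec name, so a typo
    in the variable is reported at startup rather than mid-file.
    """
    name = environ.get("MMA_ENCODING", DEFAULT_ENCODING)
    codecs.lookup(name)
    return name


def encode_midi_text(text: str, encoding: str) -> bytes:
    """Encode a string for a MIDI text event; unencodable characters become '?'."""
    return text.encode(encoding, errors="replace")


def decode_midi_text(data: bytes, encoding: str) -> str:
    """Decode bytes read from a MIDI text event; bad bytes become U+FFFD."""
    return data.decode(encoding, errors="replace")
```

The value would be looked up once and stored in globals, then passed to the file open and to the two MIDI string conversions.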