XMidi is a pair of "command line" programs to convert between MIDI and XMidi (XML) file formats. It is written entirely in Java, and is meant to be "open source". The MX program converts from MIDI to XMidi, while the XM program converts from XMidi to MIDI.
This document attempts to explain how to use the program and the XMidi (XML) format.
There may be some slight differences between command line usage of different operating systems. I will ignore them in this section. All examples assume a Windows environment, using the (DOS) Command Prompt window.
The syntax for both MX and XM is identical. Options have a leading hyphen. Non-option arguments do not have a leading hyphen. Options may occur in any order and may be interspersed with non-option arguments. Non-option arguments must be in order; the first is the input file and the second is the output file. Both must be present. Thus,
runMX blah.mid -t blah.xml
is the same as
runMX -t blah.mid blah.xml
or
runMX blah.mid blah.xml -t
I have included runMX.cmd and runXM.cmd as a simple way of invoking the programs. The contents of runXM.cmd are:
java com.palserv.XMidi.XM %1 %2 %3 %4 %5 %6
As you can see, it is very simple.
The first non-option argument on the XM or MX command line is the input file. For MX it is a MIDI file and should end with ".mid", while the XM input file is an XMidi file and should end with ".xml".
The second non-option argument on the XM or MX command line is the output file. For XM it is a MIDI file and should end with ".mid", while the MX output file is an XMidi file and should end with ".xml".
As of this writing (3/21/05) there are two options, -t and -v. Neither has any effect on XM. The -t option is for testing of the MX program. If coded, the comment is not included in the output. If not coded the default is that MX is not in test mode.
The -v option is for "verbose" output from MX. In verbose mode, MX will produce extra tags and attributes which are ignored by XM, but which may make the output more easily readable by people. If not coded, the default is that MX is not in verbose mode.
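The argument rules above can be sketched in Java roughly as follows. This is only an illustration and is not taken from the MX or XM source: the class name ArgSketch and its fields are made up, and rejecting unknown options and extra file names is an assumption about behavior that this document does not describe.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the command line rules; NOT the actual MX/XM code.
class ArgSketch {
    boolean test;      // set by -t
    boolean verbose;   // set by -v
    String inputFile;  // first non-option argument
    String outputFile; // second non-option argument

    static ArgSketch parse(String[] args) {
        ArgSketch result = new ArgSketch();
        List<String> files = new ArrayList<>();
        for (String arg : args) {
            if (arg.startsWith("-")) {
                // Options may appear anywhere on the command line.
                if (arg.equals("-t")) {
                    result.test = true;
                } else if (arg.equals("-v")) {
                    result.verbose = true;
                } else {
                    // Assumption: unknown options are treated as an error.
                    throw new IllegalArgumentException("Unknown option: " + arg);
                }
            } else {
                // Non-option arguments keep their relative order.
                files.add(arg);
            }
        }
        if (files.size() != 2) {
            throw new IllegalArgumentException("Expected an input file and an output file");
        }
        result.inputFile = files.get(0);
        result.outputFile = files.get(1);
        return result;
    }

    public static void main(String[] args) {
        // All three orderings from the examples give the same result.
        ArgSketch a = parse(new String[] {"blah.mid", "-t", "blah.xml"});
        System.out.println(a.inputFile + " -> " + a.outputFile + " test=" + a.test);
    }
}
```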
In designing the XMidi XML format, I attempted to stay as close as I could to the actual format of MIDI data within a MIDI file. My information on MIDI files came from the web site at http://www.borg.com/~jglatt/index.htm which contains lots of technical information about MIDI. I have followed the naming conventions used there.
I am not going to try to explain XML or what is an XML format in this document. Rather, I refer you to the W3C XML page which has all the answers. For each tag, I provide a portion of the XMidi.dtd DTD file which "validates" that tag. Again, refer to the W3C link, above, for information about DTDs.
<!ELEMENT XMidi (CHUNK | MThd | MTrk )*>
<!ATTLIST XMidi
    VERSION CDATA #REQUIRED >
The XMidi tag is the "root" (DOCTYPE) tag of an XMidi file. Its children (MThd, MTrk, and CHUNK) may only occur as children of this tag.
The VERSION attribute of the XMidi tag identifies the XMidi version for this file. Currently, it is "1.4".
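As an illustration of how a program might check the root tag and read its VERSION attribute, here is a small sketch using the standard Java DOM parser. It is not part of XMidi; the class name VersionSketch is made up, and the sample document in main is a bare root tag rather than real MX output.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper; not part of the XMidi programs.
class VersionSketch {
    // Returns the VERSION attribute of the XMidi root tag.
    static String readVersion(String xml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the XML text", e);
        }
        if (!"XMidi".equals(root.getTagName())) {
            throw new IllegalArgumentException("Root tag is not XMidi");
        }
        return root.getAttribute("VERSION");
    }

    public static void main(String[] args) {
        System.out.println(readVersion("<XMidi VERSION=\"1.4\"></XMidi>"));
    }
}
```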
<!ELEMENT MThd (FORMAT, TRACKS, PPNQ )>
<!ATTLIST MThd
    TYPE CDATA #REQUIRED
    LENGTH CDATA #REQUIRED >
The MThd tag marks the beginning of an MThd (header) chunk. There should be only one MThd per file. Its children (FORMAT, TRACKS and PPNQ) may only occur as children of this tag.
The TYPE attribute of the MThd should be "MThd".
The LENGTH attribute of the MThd should be "6".
<!ELEMENT FORMAT (#PCDATA )>
The content of the FORMAT tag specifies the MIDI format. This tag has neither child tags nor attributes.
<!ELEMENT TRACKS (#PCDATA )>
The content of the TRACKS tag specifies the number of MIDI tracks. This tag has neither child tags nor attributes.
<!ELEMENT PPNQ (#PCDATA )>
The content of the PPNQ tag specifies the number of Pulses (i.e. clocks) Per Quarter Note in hexadecimal. The tag name should be "PPQN", but my mind got turned around when I was writing the program and it came out PPNQ. (Hey, I'm dyslectic, KO?) This value is sometimes called the MIDI "resolution". This tag has neither child tags nor attributes. A "normal" value for the MIDI resolution is 120, which would be hex 78.
<!ELEMENT MTrk (DELTA* )>
<!ATTLIST MTrk
    TYPE CDATA #REQUIRED
    LENGTH CDATA #REQUIRED
    NUMBER CDATA #IMPLIED >
The MTrk tag marks the beginning of an MTrk (MIDI track) chunk. Its child (DELTA) may only occur as a child of this tag. All of the "music" data occurs within MIDI tracks.
The TYPE attribute of the MTrk should be "MTrk".
The LENGTH attribute of the MTrk is the length of the MIDI data in that track, not counting the 8 byte chunk header.
The NUMBER attribute of the MTrk is the MIDI track number. The MIDI tracks are numbered sequentially in the order they appear.
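For readers unfamiliar with the 8 byte chunk header mentioned above: in a MIDI file it consists of a 4 character chunk type followed by a 4 byte big-endian length, and the length does not include the header itself. The following sketch decodes such a header. It is not the MX source; the class name ChunkHeaderSketch is made up for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of decoding a chunk header; not the actual MX code.
class ChunkHeaderSketch {
    final String type; // for example "MThd" or "MTrk"
    final int length;  // number of data bytes that follow the header

    ChunkHeaderSketch(String type, int length) {
        this.type = type;
        this.length = length;
    }

    static ChunkHeaderSketch read(byte[] header) {
        if (header.length < 8) {
            throw new IllegalArgumentException("A chunk header is 8 bytes long");
        }
        // Bytes 0-3: the chunk type as ASCII characters.
        String type = new String(header, 0, 4, StandardCharsets.US_ASCII);
        // Bytes 4-7: the data length (ByteBuffer is big-endian by default).
        int length = ByteBuffer.wrap(header, 4, 4).getInt();
        return new ChunkHeaderSketch(type, length);
    }

    public static void main(String[] args) {
        byte[] header = {'M', 'T', 'r', 'k', 0, 0, 1, 0x2C};
        ChunkHeaderSketch h = read(header);
        System.out.println(h.type + " " + h.length); // prints "MTrk 300"
    }
}
```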
<!ELEMENT DELTA (STATUS | EDATA | CHANNEL | TIME_SIG )*>
<!ATTLIST DELTA
    DTIME CDATA #REQUIRED
    DECTIME CDATA #IMPLIED >
The DELTA tag indicates a time value and what occurs at that time value. The delta values are always relative to the last delta or to the beginning of the track. The "units" of delta values are called "ticks". The number of ticks per quarter note is specified in the PPNQ tag.
Since multiple notes can occur within one track at the same time, a DELTA tag can contain multiple MIDI "events".
The DTIME attribute of the DELTA is the delta value expressed as a hexadecimal number.
The DECTIME attribute of the DELTA is the delta value expressed as a decimal number. It only appears in the output of MX if -v (verbose) is coded. It is meant to improve human readability.
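Because delta values are relative, a program that wants to know when something happens has to add them up. The sketch below converts a series of DTIME values into absolute tick positions and then into quarter notes using the PPNQ value. It is an illustration only; the class and method names are made up, and it assumes the DTIME values of one track have already been collected in order.

```java
// Hypothetical sketch of working with DTIME and PPNQ values.
class DeltaSketch {
    // Turns relative DTIME values (hexadecimal text) into absolute ticks
    // measured from the beginning of the track.
    static long[] absoluteTicks(String[] dtimes) {
        long[] result = new long[dtimes.length];
        long now = 0;
        for (int i = 0; i < dtimes.length; i++) {
            now += Long.parseLong(dtimes[i].trim(), 16);
            result[i] = now;
        }
        return result;
    }

    // Converts a tick count to quarter notes, given the hexadecimal
    // content of the PPNQ tag.
    static double quarterNotes(long ticks, String ppnqHex) {
        return (double) ticks / Integer.parseInt(ppnqHex.trim(), 16);
    }

    public static void main(String[] args) {
        long[] ticks = absoluteTicks(new String[] {"0", "78", "3C"});
        for (long t : ticks) {
            System.out.println(t + " ticks = " + quarterNotes(t, "78") + " quarter notes");
        }
    }
}
```

With a PPNQ of hex 78 (120), the three deltas above land at ticks 0, 120 and 180, that is, at 0, 1 and 1.5 quarter notes.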
<!ELEMENT STATUS (EDATA | CHANNEL )*>
<!ATTLIST STATUS
    SNAM CDATA #REQUIRED
    SNMT CDATA #IMPLIED
    SVAL CDATA #REQUIRED
    SLEN CDATA #REQUIRED >
The STATUS tag reveals information about one or more MIDI events. To save space in MIDI files, the "status" data does not have to be repeated for each event unless it changes. This usually occurs with CHANNEL tags. It is not uncommon to see a series of DELTA tags where the first has a STATUS child (which in turn contains a CHANNEL, etc.) and the rest do not; instead, they have CHANNEL children directly. If this is unclear, look at the XML output for almost any MIDI file.
The SNAM attribute of the STATUS is the status "name". It is included for human readability.
The (optional) SNMT attribute of the STATUS is the "Non Midi Type". It is included for human readability.
The SVAL attribute of the STATUS is the actual status value in hexadecimal.
The SLEN attribute of the STATUS is the length of the status. It tells a program how much data is covered by this status.
<!ELEMENT EDATA (#PCDATA )>
The EDATA tag can be a child of a DELTA or a STATUS tag. It contains a string of hexadecimal digits representing the data for the delta or status.
<!ELEMENT CHANNEL (NOTE_OFF | NOTE_ON | AFTER | CONTROL | PROGRAM | PRESSURE | WHEEL )>
<!ATTLIST CHANNEL
    TYPE CDATA #REQUIRED
    NUMBER CDATA #REQUIRED >
The CHANNEL tag covers MIDI messages 80-EF (hex). In each of these cases, the first nibble of the status byte tells us the event type, while the second nibble tells us the channel number. Depending on the event type, the status byte is followed by 1 or 2 data bytes.
The TYPE attribute of the CHANNEL tag is the event type.
The NUMBER attribute of the CHANNEL tag is the channel number.
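The nibble split described above looks like this in Java. This is a sketch of the MIDI status byte layout, not the MX source; the class name ChannelSketch is made up, and how MX writes the TYPE and NUMBER attribute values (for instance, whether channels are numbered from zero or one) is not covered by this example.

```java
// Hypothetical sketch of decoding a channel status byte (80-EF hex).
class ChannelSketch {
    // First (high) nibble: the event type, 8 through E.
    static int eventType(int status) {
        return (status >> 4) & 0x0F;
    }

    // Second (low) nibble: the channel, 0 through F.
    static int channel(int status) {
        return status & 0x0F;
    }

    // Number of data bytes that follow the status byte.
    static int dataByteCount(int status) {
        if (status < 0x80 || status > 0xEF) {
            throw new IllegalArgumentException("Not a channel message");
        }
        int type = eventType(status);
        // Program change (C) and channel pressure (D) carry one data byte;
        // note off, note on, after touch, control and pitch wheel carry two.
        return (type == 0xC || type == 0xD) ? 1 : 2;
    }

    public static void main(String[] args) {
        int status = 0x93; // note on, second nibble 3
        System.out.println(Integer.toHexString(eventType(status)) + " "
                + channel(status) + " " + dataByteCount(status)); // prints "9 3 2"
    }
}
```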
<!ELEMENT NOTE_OFF EMPTY >
<!ATTLIST NOTE_OFF
    PITCH CDATA #REQUIRED
    VELOCITY CDATA #REQUIRED
    NAME CDATA #IMPLIED
    REGISTER CDATA #IMPLIED >
The NOTE_OFF tag represents a note off event. This occurs when a note (which began with a NOTE_ON tag) ends. See the description of the NOTE_ON tag for an explanation of a virtual note off event.
The PITCH attribute of the NOTE_OFF tag is the MIDI pitch of the note. It is expressed as a decimal integer. This allows the event to be related to a NOTE_ON event.
The VELOCITY attribute of the NOTE_OFF tag is the MIDI velocity of the note. (MIDI velocity controls volume.) It is expressed as a decimal integer.
The NAME attribute of the NOTE_OFF tag is the MIDI pitch of the note. It is expressed as a string such as "C#" or "Bb". It is based on the value of the PITCH attribute, but in a human readable form, rather than a number. This attribute is only included in verbose mode (-v).
The REGISTER attribute of the NOTE_OFF tag is the MIDI register of the note. It is expressed as a number. It is based on the value of the PITCH attribute. This number can range from minus five to plus five. Each number represents one octave from "C" up to "B". The octave with "Middle C" as its lowest note is register zero. Thus middle C has a pitch of sixty, a register of zero, and a name of "C". This attribute is only included in verbose mode (-v).
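The relationship between PITCH, NAME and REGISTER can be sketched as follows. This is an illustration, not the MX source: the class name PitchSketch is made up, and the choice of sharps for all black keys is an assumption (MX may well use flats such as "Bb" for some of them).

```java
// Hypothetical sketch of deriving NAME and REGISTER from PITCH (0-127).
class PitchSketch {
    // Assumption: black keys are spelled with sharps.
    private static final String[] NAMES = {
        "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"
    };

    // The note name depends only on the position within the octave.
    static String name(int pitch) {
        return NAMES[pitch % 12];
    }

    // Each register spans 12 pitches from C up to B; the octave
    // starting at middle C (pitch 60) is register zero.
    static int register(int pitch) {
        return pitch / 12 - 5;
    }

    public static void main(String[] args) {
        System.out.println(name(60) + " " + register(60));   // prints "C 0"
        System.out.println(name(0) + " " + register(0));     // prints "C -5"
        System.out.println(name(127) + " " + register(127)); // prints "G 5"
    }
}
```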
<!ELEMENT NOTE_ON EMPTY >
<!ATTLIST NOTE_ON
    PITCH CDATA #REQUIRED
    VELOCITY CDATA #REQUIRED
    NAME CDATA #IMPLIED
    REGISTER CDATA #IMPLIED >
The NOTE_ON tag represents a note on event. This occurs at the beginning of each note. Each note is supposed to end with a note off event, which is represented by a NOTE_OFF tag. However, a "tradition" has arisen where there is such a thing as a virtual note off event. A virtual note off event is a note on event with a velocity of zero.
One possible reason for this "tradition" may be to save space in a MIDI file. NOTE_ON and NOTE_OFF are different events. If a track had a bunch of alternating NOTE_ON and NOTE_OFF events, it would need a status byte for each one. If, however, each NOTE_OFF event could be represented by a NOTE_ON event with a velocity of zero, then it would be a bunch of NOTE_ON events, which could all be handled with one status. This would save at least approximately as many bytes as there are notes in the track. It may help to keep in mind that at the time the MIDI file format was being formulated, disk drives were a lot smaller than they are now.
The PITCH attribute of the NOTE_ON tag is the MIDI pitch of the note. It is expressed as a decimal integer.
The VELOCITY attribute of the NOTE_ON tag is the MIDI velocity of the note. (MIDI velocity controls volume.) It is expressed as a decimal integer. A zero value implies that this is a virtual NOTE_OFF tag.
The NAME attribute of the NOTE_ON tag is the MIDI pitch of the note. It is expressed as a string such as "C#" or "Bb". It is based on the value of the PITCH attribute, but in a human readable form, rather than a number. This attribute is only included in verbose mode (-v).
The REGISTER attribute of the NOTE_ON tag is the MIDI register of the note. It is expressed as a number. It is based on the value of the PITCH attribute. This number can range from minus five to plus five. Each number represents one octave from "C" up to "B". The octave with "Middle C" as its lowest note is register zero. Thus middle C has a pitch of sixty, a register of zero, and a name of "C". This attribute is only included in verbose mode (-v).
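A program that reads XMidi therefore has to treat two different tags as the end of a note. A minimal sketch of that test, with a made-up class and method name, might look like this:

```java
// Hypothetical sketch: deciding whether a tag ends a note.
class NoteSketch {
    // A note ends on a real NOTE_OFF, or on a "virtual" note off,
    // which is a NOTE_ON with a velocity of zero.
    static boolean endsNote(String tagName, int velocity) {
        if (tagName.equals("NOTE_OFF")) {
            return true;
        }
        return tagName.equals("NOTE_ON") && velocity == 0;
    }

    public static void main(String[] args) {
        System.out.println(endsNote("NOTE_ON", 64));  // prints "false"
        System.out.println(endsNote("NOTE_ON", 0));   // prints "true"
        System.out.println(endsNote("NOTE_OFF", 64)); // prints "true"
    }
}
```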
<!ELEMENT AFTER EMPTY >
<!ATTLIST AFTER
    PITCH CDATA #REQUIRED
    PRESSURE CDATA #REQUIRED
    NAME CDATA #IMPLIED
    REGISTER CDATA #IMPLIED >
The AFTER tag represents an after touch event. As best as I can tell, some keyboards allow the player to hit a key softly, then push down harder afterwards.
The PITCH attribute of the AFTER tag is the MIDI pitch of the note. It is expressed as a decimal integer. This allows the event to be related to a NOTE_ON event.
The PRESSURE attribute of the AFTER tag is the new pressure value for the note, which plays much the same role as a new MIDI velocity. (MIDI velocity controls volume.) It is expressed as a decimal integer.
The NAME attribute of the AFTER tag is the MIDI pitch of the note. It is expressed as a string such as "C#" or "Bb". It is based on the value of the PITCH attribute, but in a human readable form, rather than a number. This attribute is only included in verbose mode (-v).
The REGISTER attribute of the AFTER tag is the MIDI register of the note. It is expressed as a number. It is based on the value of the PITCH attribute. This number can range from minus five to plus five. Each number represents one octave from "C" up to "B". The octave with "Middle C" as its lowest note is register zero. Thus middle C has a pitch of sixty, a register of zero, and a name of "C". This attribute is only included in verbose mode (-v).
<!ELEMENT CONTROL EMPTY >
<!ATTLIST CONTROL
    NUMBER CDATA #REQUIRED
    VALUE CDATA #REQUIRED
    NAME CDATA #IMPLIED >
The CONTROL tag represents a control event.
The NUMBER attribute of the CONTROL tag is the MIDI control number of the event. It is expressed as a decimal integer.
The VALUE attribute of the CONTROL tag is the MIDI control value of the event. It is expressed as a decimal integer.
The NAME attribute of the CONTROL tag is the name of the MIDI control. It is expressed as a string such as "Bank Select" or "Portamento Time (coarse)". It is based on the value of the NUMBER attribute, but in a human readable form, rather than a number. This attribute is only included in verbose mode (-v).
<!ELEMENT PROGRAM EMPTY >
<!ATTLIST PROGRAM
    NUMBER CDATA #REQUIRED
    GMNAME CDATA #IMPLIED >
The PROGRAM tag represents a program (patch) change event.
The NUMBER attribute of the PROGRAM tag is the MIDI program (patch) number. It is expressed as a decimal integer.
The GMNAME attribute of the PROGRAM tag is the General MIDI name of the program (patch). It is expressed as a string such as "Acoustic Grand Piano" or "Bird Tweet". It is based on the value of the NUMBER attribute, but in a human readable form, rather than a number. This attribute is only included in verbose mode (-v).
<!ELEMENT PRESSURE EMPTY >
<!ATTLIST PRESSURE
    AMOUNT CDATA #REQUIRED >
The PRESSURE tag represents a MIDI channel pressure message.
The AMOUNT attribute of the PRESSURE tag is the MIDI channel pressure amount. It is expressed as a decimal integer.
<!ELEMENT WHEEL EMPTY >
<!ATTLIST WHEEL
    AMOUNT CDATA #REQUIRED >
The WHEEL tag represents a MIDI pitch wheel event.
The AMOUNT attribute of the WHEEL tag is the MIDI pitch wheel event amount. It is expressed as a decimal integer.
<!ELEMENT TIME_SIG EMPTY >
<!ATTLIST TIME_SIG
    NN CDATA #REQUIRED
    DD CDATA #REQUIRED
    CC CDATA #REQUIRED
    BB CDATA #REQUIRED
    SIG CDATA #REQUIRED >
The TIME_SIG tag represents a time signature event.
The NN attribute of the TIME_SIG tag is the numerator of the time signature. It is expressed as a decimal integer.
The DD attribute of the TIME_SIG tag is the denominator of the time signature. It is expressed as a decimal integer.
The CC attribute of the TIME_SIG tag is the number of MIDI clocks in a metronome click. It is expressed as a decimal integer.
The BB attribute of the TIME_SIG tag is the number of notated 32nd notes in a MIDI quarter note (24 MIDI clocks). It is expressed as a number.
The SIG attribute of the TIME_SIG tag is the actual time signature. It is expressed as a string such as "4/4" or "3/16". This attribute is ignored by XM, but produced by MX.
<!ELEMENT CHUNK EMPTY >
<!ATTLIST CHUNK
    TYPE CDATA #REQUIRED
    LENGTH CDATA #REQUIRED
    HEXHDR CDATA #REQUIRED
    STRHDR CDATA #IMPLIED >
The CHUNK tag represents a "chunk". Most chunks are MThd or MTrk chunks. For any other kind of chunk use CHUNK.
The TYPE attribute of the CHUNK tag is the chunk type. It should be "CHUNK".
The LENGTH attribute of the CHUNK tag is the chunk length. It is expressed as a decimal integer.
The HEXHDR attribute of the CHUNK tag is the eight byte chunk header expressed as a hexadecimal string.
The STRHDR attribute of the CHUNK tag is the eight byte chunk header expressed as a string, with unprintable characters replaced by periods. It is only produced in verbose mode.
<!ELEMENT HEXDATA (#PCDATA )>
The HEXDATA tag reveals the data of a chunk as one or more strings of hexadecimal data.
<!ELEMENT STRINGREP (#PCDATA )>
The STRINGREP tag reveals the data of a chunk as one or more strings, with unprintable characters replaced by periods. It is only produced in verbose mode.
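The two representations used for chunk headers and chunk data (a hexadecimal string, and a string with unprintable characters replaced by periods) can be sketched like this. It is an illustration only, not the MX source: the class name DumpSketch is made up, and both the use of upper case hex digits and the definition of "printable" as ASCII 20 through 7E (hex) are assumptions.

```java
// Hypothetical sketch of the hex and string representations of raw bytes.
class DumpSketch {
    // Two hexadecimal digits per byte (upper case is an assumption).
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // One character per byte; anything outside printable ASCII
    // (an assumption: hex 20 through 7E) becomes a period.
    static String toPrintable(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            int c = b & 0xFF;
            sb.append(c >= 0x20 && c < 0x7F ? (char) c : '.');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] header = {'M', 'T', 'r', 'k', 0, 0, 1, 0x2C};
        System.out.println(toHex(header));       // prints "4D54726B0000012C"
        System.out.println(toPrintable(header)); // prints "MTrk...,"
    }
}
```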