The Standard information tag is, quite simply, an unofficial Yahoo tag that different client software can use to exchange information, without the need to code a different test for each client’s own ‘unique’ way of sharing information.
The format is very flexible, with very few ‘do not’s and ‘must’s; instead it has ‘should not’s and ‘should’s. Since this is all completely voluntary, I would urge you to ‘obey’ any rigidity (marked as [rigid]). If you have a good reason to disagree, please let me know.
With the ever-increasing number of Yahoo chat clients comes an increase in the number of tags that client writers need to check for (or have their client rendered ‘blind’ to a particular chat-client’s name and/or features). Every check requires either ‘code’ or ‘data’, so every NewClient means all other NewClient-aware clients grow in size and spend extra cycles checking for it and any feature-tags. Not to mention the Internet traffic generated (and host bandwidth used) by people downloading the new version.
OK, some of that may be a bit overboard, but simply put, it will in the long run make client coders’ lives easier.
The tag itself starts with the standard Yahoo tag angle bracket <
This is followed by the Yahoo keyword font, then a single space character (ASCII char 32)
This is followed by the keyword INF
So far that gives us <font INF
Data is stored in Keyname:Value pairs.
Keyname is the name of the data (and is always preceded with a space character to separate it from the INF keyword or the preceding data) and Value is the data itself.
The INF tag ends at the next standard Yahoo tag terminator >
Keynames should not be more than 8 characters long (not including any suffix).
Keynames should not contain a space character (although they MAY, since the Keyname is terminated at the next colon character). If a colon is not found, then the Keyname is void and the rest of the INF tag should be considered invalid.
Keynames ARE NOT case sensitive. [rigid]
The Keyname may end in a suffix, in which case the suffix is removed; it merely serves to indicate HOW the Value should be interpreted.
If a Keyname ends with the % suffix, then Value should be interpreted as url-encoded: each % character (within the Value, not the Keyname) is followed by an ASCII representation of a hexadecimal value, and the whole sequence (including the % character) should be replaced with the ASCII character it represents. Invalid hex digits should result in the Key being void.
If a Keyname ends with the $ suffix, then Value should be interpreted as an ASCII representation of a hexadecimal string; the actual value stored should be the ASCII characters given by the hexadecimal pairs.
If neither of the above suffixes is found then Value is literal.
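The three interpretations above can be sketched in Python as follows. This is only an illustration, not a reference implementation: the function names are made up, and it assumes that each % escape is exactly two hex digits and that a $ value with an odd length or a non-hex digit is void (the text only states the void rule for the % suffix).

```python
import string

HEX_DIGITS = set(string.hexdigits)


def split_suffix(keyname):
    """Return (bare_keyname, suffix), where suffix is '%', '$' or ''."""
    if keyname and keyname[-1] in "%$":
        return keyname[:-1], keyname[-1]
    return keyname, ""


def decode_value(raw, suffix):
    """Decode a raw Value according to its Keyname suffix.

    Returns the decoded text, or None when the pair should be treated as void.
    """
    if suffix == "":
        return raw  # no suffix: the Value is literal
    if suffix == "$":
        # Whole Value is hex pairs. Voiding bad input is an assumption here.
        if len(raw) % 2 or any(c not in HEX_DIGITS for c in raw):
            return None
        return bytes.fromhex(raw).decode("latin-1")
    # suffix == "%": replace every %XX with the character it encodes
    out = []
    i = 0
    while i < len(raw):
        if raw[i] == "%":
            digits = raw[i + 1:i + 3]
            if len(digits) != 2 or any(d not in HEX_DIGITS for d in digits):
                return None  # invalid hex digits: the Key is void
            out.append(chr(int(digits, 16)))
            i += 3
        else:
            out.append(raw[i])
            i += 1
    return "".join(out)
```

For instance, decode_value("%41%42%43%44", "%") and decode_value("41424344", "$") both give ABCD.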
If the first character after the colon is a double quote ( " ) character, then Value is terminated at the next double quote char (the quote characters are NOT part of the actual value); otherwise Value is terminated at the next space (ASCII 32) character OR the end of the INF tag. [rigid]
Where the closing quote character is missing, the Keyname:Value pair should be considered void (and the remainder of the INF tag discarded).
The following all set the Key harry to the same value of ABCD
The following both set welcome to hello there. (two words separated with a space)
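To make the splitting rules concrete, here is a rough Python sketch of a tokeniser for the tag. The name parse_inf is hypothetical. It assumes the `<font INF` prefix is matched exactly as written, folds Keynames to lower case (they are not case sensitive), and returns Values raw, leaving any suffix on the Keyname for a later decoding step.

```python
PREFIX = "<font INF"


def parse_inf(post):
    """Return (keyname, raw_value) pairs from the first INF tag in post."""
    start = post.find(PREFIX)
    if start < 0:
        return []
    end = post.find(">", start)  # the tag ends at the next '>'
    if end < 0:
        return []
    body = post[start + len(PREFIX):end]
    if body and body[0] != " ":
        return []  # e.g. "<font INFO ...": not an INF tag
    pairs = []
    pos = 0
    while True:
        while pos < len(body) and body[pos] == " ":
            pos += 1  # skip the space(s) separating pairs
        if pos >= len(body):
            break
        colon = body.find(":", pos)
        if colon < 0:
            break  # no colon: Keyname void, rest of the tag invalid
        keyname = body[pos:colon].lower()
        pos = colon + 1
        if pos < len(body) and body[pos] == '"':
            close = body.find('"', pos + 1)
            if close < 0:
                break  # missing closing quote: pair void, rest discarded
            value = body[pos + 1:close]
            pos = close + 1
        else:
            space = body.find(" ", pos)
            if space < 0:
                space = len(body)
            value = body[pos:space]
            pos = space
        pairs.append((keyname, value))
    return pairs


print(parse_inf('<font INF harry:ABCD welcome:"hello there">hi all'))
```

Only the first colon ends the Keyname, so a pair such as TM:5:15 keeps 5:15 as its Value.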
Some Keynames have been defined. However, since the INF tag is not owned or governed, anyone can create (or own) Keynames. People should be aware that anyone and everyone is free NOT to implement a Keyname, or even to define a new Keyname that serves the same purpose.
When defining a Keyname:
Try to be sure that it hasn’t been used already (ask if you’re not sure)
Try to be sure that there isn’t a Keyname already defined that serves the same purpose. Can you use an already defined Keyname instead?
Whilst someone can ‘redefine’ a Keyname in their client, that will return us to the same per-client tests that (I believe) we should be trying to move away from (if ID="this" then AVA="that" else… type logic)
Keyname  Purpose                          Example(s)
ID       client ID                        JAM – YHLT – Ymlite – Yzak (alphabetical order)
VER      client version                   5.1.2
AVA      avatar                           REPOSITORYCODE\avatarname (see note below)
TM       local time (hh or hh:mm)         01 – 23 – 5:15 – 02:33
LTIME    local time & date (TDateTime)    38244.9497271528 (see below)
PROT     protocol                         YCHT – CHAT2 – DHTML – YMSG
SUM      checksum                         A423EE1F (see note below)
SEX      sex                              M F U N (Male, Female, Unknown and Neutral)
GOS      munged text                      Secret codes… (see note below)
LOVE     the person/thing you love        "Mrs Troll" – Skiing – "making woopey"
GLY      glyph                            18x18 pixel single colour images (see below)
The INF tag itself should be the first thing in the speech part of the chat room post.
If adding the INF tag to the chat post would cause the post to break a rule (for example, pushing the total length of the chat post over a server-imposed limit), then the INF tag should either be omitted in part (dropping less important data; I suggest ordering pairs left to right, most important to least important) or be omitted completely.
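The left-to-right priority suggestion could be sketched like this. The function name build_inf and the room_for_tag parameter (how many characters the post can spare for the tag) are hypothetical, and the sketch assumes values contain no double quote or > characters.

```python
def build_inf(pairs, room_for_tag):
    """Build an INF tag from (keyname, value) pairs, most important first.

    Pairs are dropped from the right until the tag fits in room_for_tag
    characters. Returns "" when not even the first pair fits, which
    corresponds to omitting the tag completely.
    """
    def fmt(keyname, value):
        # Values containing a space must be quoted.
        if " " in value:
            return f'{keyname}:"{value}"'
        return f"{keyname}:{value}"

    kept = list(pairs)
    while kept:
        tag = "<font INF " + " ".join(fmt(k, v) for k, v in kept) + ">"
        if len(tag) <= room_for_tag:
            return tag
        kept.pop()  # drop the least important (rightmost) pair
    return ""


pairs = [("ID", "JAM"), ("VER", "5.1.2"), ("LOVE", "Mrs Troll")]
print(build_inf(pairs, 100))
print(build_inf(pairs, 30))
```

With room for 30 characters the LOVE pair is dropped and the ID and VER pairs survive.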
A more complicated rule could be used; data pairs only ‘need’ be re-transmitted if:
a) Someone new has joined the room,
b) The value of the data has changed
However, this method requires the server to be reliable. ;-)
Remember, implementation of Keys is optional.
REPOSITORY is a shorthand reference to a web-site where avatars are held, for example:
If avatarname does not include a file extension (.jpg .bmp .gif etc) then .jpg is assumed. (In fact, if the avatar is a jpg then the extension should not be included.)
However, if you do not understand how this can be abused, you should not implement it, or you should disallow downloading avatars from http:// style repositories by default (leaving it to the end-user to choose to put their IP address at risk).
In this case the file extension SHOULD be included. Any client downloading avatars should also check that it is in fact downloading a valid graphics file (this includes usable filetypes: if you cannot display GIFs in your TRichEdit, why download one!), NOT the latest virus/exploit/Trojan etc.
Also, you may want to decide upon a ‘default’. I, for example, try to download from the chat-help repository if no repository prefix is included, simply because the chat-help repository is maintained and the images ‘screened’ for suitability.
This is the ASCII representation of a TDateTime (double-precision floating point). The integer part represents the number of days since 30 December 1899. The fraction part is how far ‘through’ a 24 hour day the time is: for example, 0.25 is 6am, 0.5 is 12 midday and 0.75 is 6pm.
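As a rough illustration, an LTIME value can be turned into a calendar date and time in Python like this. The function name is made up, and the sketch uses Delphi's documented TDateTime day zero of 30 December 1899 and ignores negative values.

```python
from datetime import datetime, timedelta

# Day zero of Delphi's TDateTime.
DELPHI_EPOCH = datetime(1899, 12, 30)


def ltime_to_datetime(text):
    """Convert the text of an LTIME value into a datetime."""
    return DELPHI_EPOCH + timedelta(days=float(text))


print(ltime_to_datetime("38244.9497271528"))
```

The example value from the Keyname table comes out as the evening of 14 September 2004, a little after 22:47.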
This can be used to verify the data that precedes the SUM key in the INF tag.
The value is a reduced MD5 of:
The YahooID of the sender
The Current Chatroom (including the colon and its number)
The contents of the INF tag up to and including the space char preceding the SUM key.
Reduction can be achieved by XORing byte quads (first byte of reduced checksum = byte1 xor byte2 xor byte3 xor byte4 of the original MD5 checksum, second byte of reduced checksum = byte5 xor byte6 xor byte7 xor byte8, and so on), reducing the checksum down to 4 bytes, which in turn can be represented as 8 hex chars (e.g. SUM:a56231ff; do NOT use SUM$:a56231ff, since that would cause the value to be parsed and reduced to 4 bytes rather than an 8 char string).
Any data following a SUM Key should be noted as ‘unverified’.
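A possible Python sketch of the checksum follows. The text does not say how the three inputs are joined or encoded, so this assumes they are simply concatenated in the order listed, with no separators, as Latin-1 bytes; the function names are hypothetical. Since the examples show both upper and lower case hex, a receiver would probably want to compare case-insensitively.

```python
import hashlib


def reduce_digest(digest):
    """XOR each group of four bytes down to a single byte."""
    out = bytearray()
    for i in range(0, len(digest), 4):
        out.append(digest[i] ^ digest[i + 1] ^ digest[i + 2] ^ digest[i + 3])
    return bytes(out)


def inf_checksum(yahoo_id, room, tag_prefix):
    """Return the 8 hex chars for the SUM key.

    tag_prefix is the INF tag text up to and including the space that
    precedes the SUM key. Joining the inputs by plain concatenation is
    an assumption of this sketch.
    """
    data = (yahoo_id + room + tag_prefix).encode("latin-1")
    return reduce_digest(hashlib.md5(data).digest()).hex()


# Hypothetical sender, room and tag contents, for illustration only.
print(inf_checksum("some_user", "Some Room:1", "<font INF ID:JAM "))
```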
Why? Because one day it may be useful to verify the INF tag. Yes, it can be faked, but it’s more secure than a Key called “isthisINFok” with a value of 1 if it is ;-)
I chose MD5 purely because it is a well documented and easily verifiable one-way hashing algorithm which is ‘just’ difficult enough to deter all but the most die-hard spoofers.
GOS has no ‘set’ method of ciphering/deciphering the value (all clients could have their own method). It would therefore be prudent to check the ID Key (if present) before deciphering the text, or to include a method of checking the validity of the deciphered text.
GLY value represents an 18 by 18 pixel, 1-bit image. Value will always be 55 bytes long (a good, quick check for invalid data before you start decoding).
Bytes are encoded using the Y64 method – whereby:
The full char set is:
where ‘.’ is zero and ‘z’ is sixtythree (63).
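A lookup of this kind might look as follows in Python. The 64-character alphabet used here is inferred from the values this document quotes ('.' is 0, '/' is 1, '3' is 5, 'T' is 31, 'l' is 49, 'z' is 63), so treat it as an assumption rather than the definitive char set; y64_to_int is simply a Python spelling of Y64toINT.

```python
# Assumed alphabet: '.', '/', then 0-9, A-Z, a-z (64 characters in total).
Y64_CHARS = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def y64_to_int(ch):
    """Return the value 0..63 of a single Y64 character.

    Raises ValueError for anything that is not exactly one Y64 character,
    so a caller can abort the decode on invalid data.
    """
    if len(ch) != 1:
        raise ValueError("expected a single character")
    value = Y64_CHARS.find(ch)
    if value < 0:
        raise ValueError("not a Y64 character: %r" % ch)
    return value


print(y64_to_int("3"))
```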
The first byte of the value represents the foreground colour of the glyph, containing red green and blue values stored as 2 bits each (with the possible values of 0 to 3 each)
Firstly, you need a method of converting from Y64 to an ordinal value (byte, integer etc). This function should accept a single character and return an ordinal type, so you feed it the character ‘3’ and it returns the value 5. Care should be taken to abort the decode if invalid (non-Y64) characters are encountered. Below, Y64toINT is a function that performs this.
EncodedCol=Y64toINT( First Byte of Value)
Red is stored in bits 4 and 5
Green is stored in bits 2 and 3
Blue is stored in bits 0 and 1
In binary EncodedCol looks like:
Bits: 7 6 5 4 3 2 1 0
. . R R G G B B
In BASIC (\ is integer division):
red=(EncodedCol \ 16) and 3
green=(EncodedCol \ 4) and 3
blue=EncodedCol and 3
In C:
red=(EncodedCol >> 4) & 3
green=(EncodedCol >> 2) & 3
blue=EncodedCol & 3
In Pascal:
red=(EncodedCol shr 4) and 3
green=(EncodedCol shr 2) and 3
blue=EncodedCol and 3
red, green and blue will now each hold a value in the range 0 to 3. To get a true colour (24 bit) version, simply multiply the red, green and blue values by 85; they will then be in the range 0 to 255.
The actual image itself occupies the remaining 54 bytes, which comprise 18 rows of 3 bytes each.
The second, third and fourth bytes represent the first scanline of the image. For example,
if the 2nd, 3rd and 4th chars are “..z” (not including quotes)
then the values gained from the Y64toINT function would be 0, 0 and 63 respectively,
which when represented in binary gives us: 000000 000000 111111 (the top scanline of our image).
The remaining 17 scanlines are stored in the same way.
Below are the first 7 scanlines of an image with their respective decimal values and Y64 chars
ScanLine  1st byte  2nd byte  3rd byte  decimal values  Y64 chars
0         000000    000000    111111    0,0,63          ..z
1         000000    000000    011111    0,0,31          ..T
2         000000    000000    001111    0,0,15          ..D
3         000000    000000    011111    0,0,31          ..T
4         000000    000000    111011    0,0,59          ..v
5         000000    000001    110001    0,1,49          ./l
6         000011    111110    000000    3,62,0          1y.
The remaining 33 Y64 chars (11 scanlines of 3 chars each) would all be “.” for 0 (no foreground pixels).
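Going the other way, the scanlines above can be encoded with a few lines of Python. This is a sketch only: the alphabet is inferred from the values quoted in this document ('.' is 0, '3' is 5, 'z' is 63), and encode_scanline is a hypothetical helper. Note that under this alphabet a zero byte always encodes as '.', so scanline 6 (3, 62, 0) comes out as 1y. rather than ending in the digit 0, which stands for 2.

```python
# Assumed alphabet: '.', '/', then 0-9, A-Z, a-z (64 characters in total).
Y64_CHARS = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def encode_scanline(bits):
    """Encode an 18-character string of '0'/'1' (leftmost pixel first)."""
    if len(bits) != 18 or set(bits) - {"0", "1"}:
        raise ValueError("a scanline is exactly 18 bits")
    # Each group of six pixels becomes one Y64 character, high bit first.
    return "".join(Y64_CHARS[int(bits[i:i + 6], 2)] for i in (0, 6, 12))


rows = [
    "000000000000111111",
    "000000000000011111",
    "000000000000001111",
    "000000000000011111",
    "000000000000111011",
    "000000000001110001",
    "000011111110000000",
]
print("".join(encode_scanline(r) for r in rows))
```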
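All variables are of type integer (at least 8 bits wide)
Let B=Bitmap(size 18 x 18 pixels) //Clear B to backgroundColour
Let x=0 : Let y=0 : Let xc=0 : Let I=2 //I=2 skips the colour character
MainLoop:
Let c=Y64toINT(value[I]) //where value[I] returns the Ith character (1=first char, the colour byte) of the GLY
if c=0 then Let x=x+6 : GOTO SkipInnerLoop //six blank pixels, nothing to plot
Let xl=0
InnerLoop:
if (c and 32)=32 then Let B.pixel[x,y]=foregroundcolour
Let c=c shl 1 : Let x=x+1 : Let xl=xl+1
If xl<6 then GOTO InnerLoop
SkipInnerLoop:
Let xc=xc+1
If xc<3 then GOTO SkipNewRow
Let xc=0 : Let x=0 : Let y=y+1 //three chars done, start the next scanline
SkipNewRow:
Let I=I+1
If I<56 then GOTO MainLoop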
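For comparison, the whole GLY decode (colour character plus 18 scanlines) might be written in Python roughly as below. This is a sketch under assumptions: the alphabet is inferred from the values quoted in this document, decode_gly is a hypothetical name, and the image is returned as nested lists of booleans rather than drawn onto a bitmap.

```python
# Assumed alphabet: '.', '/', then 0-9, A-Z, a-z (64 characters in total).
Y64_CHARS = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def decode_gly(value):
    """Return ((red, green, blue), rows) for a 55-character GLY value.

    rows is a list of 18 scanlines, each a list of 18 booleans where
    True marks a foreground pixel.
    """
    if len(value) != 55:
        raise ValueError("a GLY value is always 55 characters long")
    codes = []
    for ch in value:
        n = Y64_CHARS.find(ch)
        if n < 0:
            raise ValueError("not a Y64 character: %r" % ch)
        codes.append(n)
    # First character: colour as ..RRGGBB, each 0..3, scaled by 85 to 0..255.
    colour = tuple(((codes[0] >> shift) & 3) * 85 for shift in (4, 2, 0))
    rows = []
    for y in range(18):
        row = []
        for c in codes[1 + 3 * y:4 + 3 * y]:
            for bit in (32, 16, 8, 4, 2, 1):  # leftmost pixel is the high bit
                row.append(bool(c & bit))
        rows.append(row)
    return colour, rows


# 'k' is 48 (binary 110000), i.e. full red, followed by 18 scanlines.
colour, rows = decode_gly("k" + "..z..T..D..T..v./l1y." + "." * 33)
print(colour)
for row in rows[:7]:
    print("".join("#" if pixel else "." for pixel in row))
```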