SignWriting Image Server

Open Font and Rendering Software

Beta 5c

Copyright 2007,2008,2009 Stephen E Slevinski Jr. Some Rights Reserved.
Except where otherwise noted, this work is licensed under
Creative Commons Attribution ShareAlike 3.0


Dedicated to my wife Sonia, to Valerie Sutton and to the entire SignWriting List:
their support and encouragement have always been there.
Dedicated to all lovers of the written word,
especially Mortimer J. Adler who taught me how to read a book.
There have been too many other teachers and friends to name them all.

Contents

Preface

Acknowledgments

1. SignWriting

2. Script

3. Binary SignWriting

4. Glyph

5. Glyphogram

6. Column

7. SymbolPalette

8. SignMaker

9. Project Files

Release Candidate 1


Preface

The primary purpose of the SignWriting Image Server (SWIS) is to display SignWriting images quickly with a simple installation. The secondary purpose is to document and demonstrate sign language as text. SWIS requires a web server with PHP and the GD graphics library.

The vision of the SignWriting Image Server is to provide tools to view and edit Binary SignWriting. SWIS can and should be used to generate test data, verify image display, and model behavior for implementations with other programming languages on any platform. The SymbolPalette and SignMaker are currently available as part of SWIS. SignText is in development and will be available in Release Candidate 1.

This project is about open standards. We are using the GPL v3, the Open Font License, and Creative Commons (by-sa).

Acknowledgments

I undertook this project due to interest around the world, so that others can use the open SignWriting standards. These standards have evolved from the inspiration, dedication and hard work of Valerie Sutton, the Deaf Action Committee, and all of the various generations of sign writers, from 1974 until today.

1. SignWriting

SignWriting is a writing system for the world's sign languages. Learning the Art of SignWriting is beyond the scope of this guide. Please try SignWriting Home, SignWriting List, and SignPuddle Online.

History

SignWriting was created by Valerie Sutton in 1974. Initially SignWriting was used to transcribe sign language. SignWriting evolved when the signers themselves started to write. Many changes were made to the script: the most notable are the switch from receptive to expressive, from horizontal to vertical, and the inclusion of lanes to account for body weight shift.

SignWriter DOS

From 1986 through 1996, Richard Gleaves worked on SignWriter 1.0 through 4.3. In 2000, Richard returned to complete SignWriter 4.4.

SignWriter was a breakthrough application for SignWriting because it not only made it possible to use SignWriting on computers, but it created a method for touch typing with sign language.

SignPuddle

Starting in 2004, I developed SignPuddle to bring SignWriting to the web. SignPuddle offered a new drag & drop method for SignWriting that quickly became popular. SignPuddle built on many of the ideas of SignWriter, utilizing the special commands for variations, fills, and rotations.

The International MovementWriting Alphabet (IMWA) was originally designed as a huge repository for the symbols of SignWriting and DanceWriting. SignPuddle offered unrestricted access to the entire IMWA. This made it possible to use a larger symbol set than was previously available.

Open Standards

SignWriting is used around the world. When the standards were still evolving, tighter control was needed to guide the development. Now that the standards are stabilizing and can address almost any issue for any sign language, we realize that we need to adopt open standards so that SignWriting can spread even farther.

In 2007, Valerie undertook a huge effort to revamp and restructure the IMWA so that it focused exclusively on SignWriting. The resulting International SignWriting Alphabet (ISWA) is the most extensive and well-organized symbol set in the history of SignWriting.

The ISWA 2008 is available under the Open Font License. This SignWriting Image Server project is available under the GPL v3.


2. Script

SignWriting is the universal script for sign language that is easy to use with either pencil or computer. SignWriting is used on a daily basis to write notes and literature. SignWriting has a simple and robust data representation.

Symbol

SignWriting uses many meaningful symbols organized in an ordered hierarchy. At the top of the hierarchy, there are 30 SymbolGroups. Each SymbolGroup contains a certain type of symbol: writing, punctuation, or sorting. Each type of symbol is used for a different purpose.

Each SymbolGroup has anywhere from 5 to 58 BaseSymbols. Each BaseSymbol can have up to a maximum of 96 valid symbols. A symbol is valid only when both its fill and its rotation are valid for its BaseSymbol. There are 6 possible fill values and 16 possible rotation values.

Cluster

Glyphs (symbol images) are spatially combined to create a glyphogram (cluster of spatial symbols).

The major difference between SignWriting and other scripts is that SignWriting is spatial, not sequential. Almost all other scripts put one character after another, either left to right or top to bottom. The symbols of SignWriting are written spatially. Each sign, like a word, is written based on spelling rules. One writing symbol is used as the artistic center of the sign and the other symbols relate or revolve around that artistic center spatially.

The visual center is different from the artistic center. The visual center is based on the symbols used in the writing. The visual center of a hand shape is the absolute center of the image, while the artistic center would be the center of the palm. The visual center of a sign is the visual center of the sign image, taking an average of the min and max coordinate values. However, if a sign contains one or more centering symbols, only the min and max coordinates of the centering symbols are considered, and the coordinates of all non-centering symbols are ignored.
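The centering rule can be sketched in a few lines of JavaScript. This is only an illustration, not code from SWIS: the field names (x, y, width, height, centering) are assumptions, and the max coordinate is taken to be the top left position plus the glyph size.

```javascript
// Sketch: visual center of a sign, computed from symbol bounding boxes.
// Field names (x, y, width, height, centering) are illustrative assumptions.
function visualCenter(symbols) {
  // If the sign has centering symbols, only they take part in the calculation.
  const centering = symbols.filter((s) => s.centering);
  const used = centering.length > 0 ? centering : symbols;
  if (used.length === 0) return null;

  const minX = Math.min(...used.map((s) => s.x));
  const maxX = Math.max(...used.map((s) => s.x + s.width));
  const minY = Math.min(...used.map((s) => s.y));
  const maxY = Math.max(...used.map((s) => s.y + s.height));

  // The center is the average of the min and max coordinate values.
  return { x: (minX + maxX) / 2, y: (minY + maxY) / 2 };
}

// Hypothetical sign: one ordinary symbol and one centering symbol.
const sign = [
  { x: 85, y: 109, width: 20, height: 30, centering: false },
  { x: 107, y: 127, width: 15, height: 15, centering: true },
];
console.log(visualCenter(sign));
```

With the centering flag removed from the second symbol, both bounding boxes would contribute and the center would shift toward the first symbol.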

Therefore, each sign is made up of writing symbols spatially arranged on a 2 dimensional canvas. The canvas has a defined visual center and a subjective artistic center. Neither punctuation nor sorting symbols are allowed in a cluster.

Column

Signs are stacked top down in columns based on their visual center. The columns are viewed left to right. Lanes are an advanced writing concept that describes body weight shifts. The signer physically moves their center. It is more than just leaning. There are three lanes available: left, middle, and right. When writing without lanes, the middle of the column is used to align the center of the glyphograms. When lanes are used, the middle of the column becomes the middle lane. A right lane and a left lane are added at specific horizontal offsets. Glyphograms in either the right or left lane are shifted horizontally to align the lane and glyphogram center.

Sort

Sorting is achieved with the SignSpelling Sequence, hereafter called the sequence. The visible spelling of a sign is separate from the sort spelling used to order signs. A sequence can use either writing or sorting symbols, but not punctuation.

3. Binary SignWriting

Binary SignWriting is a character encoding model that uses a simple relationship between symbol id and character code to produce a double octet coded character set.

ISWA 2008

The ISWA 2008 is the abstract character set for Binary SignWriting. Each symbol of the ISWA 2008 has a unique symbol id, used as the character name. The symbol id is a 6-part number system. An example symbol id is "01-01-001-01-01-01", which is a combination of "category" - "group" - "base" - "variation" - "fill" - "rotation". The meaning behind the symbol id is used to create the SymbolGroup data and the BaseSymbol data. This data enables 2-way encoding between symbol id and character code, validity checking of character code, and usability data for SignMaker and SignText.
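As a small illustration, a symbol id can be split into its six named parts. The function name is hypothetical, and the digit widths (2-2-3-2-2-2) are an assumption taken from the single example above.

```javascript
// Part names in the order given in the text.
const SYMBOL_ID_FIELDS = ["category", "group", "base", "variation", "fill", "rotation"];

// Sketch: split a symbol id such as "01-01-001-01-01-01" into named numbers.
// The digit widths are assumed from the one example in this guide.
function parseSymbolId(id) {
  const match = /^(\d{2})-(\d{2})-(\d{3})-(\d{2})-(\d{2})-(\d{2})$/.exec(id);
  if (!match) return null; // not shaped like a symbol id
  const parts = {};
  SYMBOL_ID_FIELDS.forEach((name, i) => {
    parts[name] = parseInt(match[i + 1], 10);
  });
  return parts;
}

console.log(parseSymbolId("01-01-001-01-01-01"));
```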

The ISWA 2008 defines 30 SymbolGroups, 639 BaseSymbols, and 35023 symbols.

Character Encoding Model

The principles behind Binary SignWriting are simple. The coded character set is 7-bit ASCII compatible. The next 128 codes are reserved for special control codes, of which a minimal set is defined.

The first BaseSymbol is assigned a character code of 256. Each BaseSymbol can have a maximum of 96 symbols (6 fills and 16 rotations). The next BaseSymbol is assigned a character code of 256 + 96.

This encoding results in a starting code of 256 and a maximum code of 61599 (256 + 639 × 96 - 1). That range holds 61344 code values, of which 35023 are valid codes.
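The arithmetic can be checked with a short sketch. The function names and the 0-based BaseSymbol index are assumptions made for illustration; only the constants 256, 96, and 639 come from this guide.

```javascript
const FIRST_CODE = 256;      // character code of the first BaseSymbol
const SYMBOLS_PER_BASE = 96; // 6 fills x 16 rotations
const BASE_COUNT = 639;      // BaseSymbols in the ISWA 2008

// Character code where a BaseSymbol starts (baseIndex is 0-based; an assumption).
function baseStartCode(baseIndex) {
  return FIRST_CODE + baseIndex * SYMBOLS_PER_BASE;
}

// Split a symbol character code into its BaseSymbol index and the slot
// (0 to 95) inside that BaseSymbol. Returns null outside the symbol range.
function splitCode(code) {
  const rel = code - FIRST_CODE;
  if (rel < 0 || rel >= BASE_COUNT * SYMBOLS_PER_BASE) return null;
  return {
    baseIndex: Math.floor(rel / SYMBOLS_PER_BASE),
    slot: rel % SYMBOLS_PER_BASE,
  };
}

console.log(baseStartCode(1));              // 352, the second BaseSymbol
console.log(baseStartCode(BASE_COUNT) - 1); // 61599, the maximum code
console.log(splitCode(0x1ce0));             // { baseIndex: 74, slot: 32 }
```

A slot only says where a code falls inside its BaseSymbol; whether that slot is a valid symbol still depends on the BaseSymbol data.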

Number characters are defined as having unique character codes. This avoids character collision when parsing. The spatial aspect of sign language data requires coordinate information. Each spatial symbol is associated with 2 number characters. The number characters have a range from -1919 through 1919. These number characters are used for the X,Y position of the top left of the symbol when placed on a 2 dimensional grid.

Data Stream

The data stream is a sequential list of 16 bit character codes. Each code is the same size.
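Because every code is the same size, a data stream can be split without any lookahead. The sketch below assumes the stream is written as four hexadecimal digits per code, which is how the examples later in this guide are written; the function name is hypothetical.

```javascript
// Sketch: split a hex-encoded data stream into 16-bit character codes.
function splitDataStream(hex) {
  if (hex.length % 4 !== 0 || /[^0-9a-f]/i.test(hex)) {
    throw new Error("expected a hex string whose length is a multiple of 4");
  }
  const codes = [];
  for (let i = 0; i < hex.length; i += 4) {
    codes.push(parseInt(hex.slice(i, i + 4), 16));
  }
  return codes;
}

console.log(splitDataStream("00801ce0f8d4f8ec8746f8eaf8fe"));
// [ 128, 7392, 63700, 63724, 34630, 63722, 63742 ]
```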

ABNF Definition


Token Stream

The token stream is an alternate view of the data stream. Each code is replaced with a single ASCII character. There are only 9 characters used in the token stream: LBRwcnQsP. Case does not matter. Upper and lower cases are used to aid human scanning.

Token definition

Regular expressions

Unicode Integration

Unicode is a big consideration for any character set. SignWriting imposes unique requirements on the model due to its spatial nature. Unicode uses a sequential character model. Sometimes, Unicode will use subscripts and superscripts to give the illusion of a spatial writing system, but the data is still a sequential list of characters without coordinate data.

The plane 4 solution makes it possible to convert sign language data encoded with Binary SignWriting into a Unicode string. An entire 16-bit Unicode plane is used for character mapping. The plane 4 solution was created after the sign language data model. All consideration was given to fulfilling the requirements of sign language data: using the ISWA 2008 with spatial information. The plane 4 solution is as simple as possible without consideration of Unicode's definitions and restrictions. There may be more complex solutions that fit Unicode's paradigm better; however, any solution must be able to represent the entirety of Binary SignWriting.

Just because we can represent sign language data with UTF-8 doesn't mean that we use Unicode internally. The binary data must fulfill the model and have an equivalent Unicode representation. Binary SignWriting is a robust encoding model that satisfies the requirements and is Unicode compatible.

PHP example

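 // Convert a 16-bit character code (hex string) to a percent-encoded
 // UTF-8 sequence on Unicode plane 4 (code point 0x40000 + code).
 function char2unicode($char){
   $code = hexdec($char);
   $a = $code % 64;                 // lowest 6 bits
   $b = (int) floor($code / 64);
   $c = (int) floor($b / 64);       // highest 4 bits
   $b -= $c * 64;                   // middle 6 bits

   $utf8 = array();
   $utf8[] = "f1";                  // lead byte for plane 4
   $utf8[] = dechex($c + 128);      // continuation bytes start at 0x80
   $utf8[] = dechex($b + 128);
   $utf8[] = dechex($a + 128);
   return "%" . implode("%", $utf8);
 }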

JavaScript example

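 // Convert a 16-bit character code (hex string) to a percent-encoded
 // UTF-8 sequence on Unicode plane 4 (code point 0x40000 + code).
 function char2unicode(hex){
   var code = parseInt(hex, 16);
   var a = code%64;              // lowest 6 bits
   var b = Math.floor(code/64);
   var c = Math.floor(b/64);     // highest 4 bits
   b -= c*64;                    // middle 6 bits
   a += 128;                     // continuation bytes start at 0x80
   b += 128;
   c += 128;
   return '%f1'+'%'+c.toString(16)+'%'+b.toString(16)+'%'+a.toString(16);
 }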
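The byte arithmetic in the two examples above amounts to UTF-8 encoding the code point 0x40000 + code. As a cross-check, the same result can be reached with the standard String.fromCodePoint and encodeURIComponent functions. The function name below is hypothetical, and this is a sketch for comparison rather than a replacement for the SWIS code.

```javascript
// Sketch: map a Binary SignWriting character code (hex string) onto
// Unicode plane 4 and percent-encode its UTF-8 bytes.
function char2unicodeViaCodePoint(hexCode) {
  const code = parseInt(hexCode, 16);
  if (Number.isNaN(code) || code < 0 || code > 0xffff) {
    throw new RangeError("expected a 16-bit character code written as hex");
  }
  const ch = String.fromCodePoint(0x40000 + code);
  // encodeURIComponent emits upper case hex; lower case matches the examples above.
  return encodeURIComponent(ch).toLowerCase();
}

console.log(char2unicodeViaCodePoint("1ce0")); // %f1%81%b3%a0
console.log(char2unicodeViaCodePoint("0080")); // %f1%80%82%80
```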

Hello World

4. Glyph

The basic element of SignWriting is the symbol. Each symbol has a defined image called a glyph. The glyphs are accessed by character code.

PHP

The glyphs can be accessed using the glyph.php script. Glyphs always have a transparent background. It is possible to change the size of the glyph along with the color of the line or the fill.

Basic Example: <img src="glyph.php?code=256">
Colorize: <img src="glyph.php?code=256&colorize=1">
Line Color: <img src="glyph.php?code=256&line=00ff00">
Fill Color: <img src="glyph.php?code=256&fill=ff0000">
Size: <img src="glyph.php?code=256&size=0.7">
Multiple: <img src="glyph.php?code=256&size=0.7&line=00ff00&fill=ff0000">
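When the image source is built from script, a small helper keeps the query string consistent. The helper name is hypothetical; the parameter names (code, colorize, line, fill, size) are the ones shown in the examples above, and no other parameters are assumed.

```javascript
// Sketch: build a glyph.php URL from a character code and optional settings.
function glyphUrl(code, options = {}) {
  const params = new URLSearchParams({ code: String(code) });
  // Only the parameters shown in this guide are passed through.
  for (const name of ["colorize", "line", "fill", "size"]) {
    if (options[name] !== undefined) {
      params.set(name, String(options[name]));
    }
  }
  return "glyph.php?" + params.toString();
}

console.log(glyphUrl(256));
// glyph.php?code=256
console.log(glyphUrl(256, { size: 0.7, line: "00ff00", fill: "ff0000" }));
// glyph.php?code=256&line=00ff00&fill=ff0000&size=0.7
```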

JavaScript

The glyphs default to a colorized line and a white fill. A custom ISWA 2008 extract is needed for black and white and reduced size.

5. Glyphogram

The SignWriting glyphs are used spatially, not sequentially. A simple X,Y coordinate system is used to arrange the glyphs in space. This results in a single image of spatially arranged glyphs called a glyphogram.

Glyphogram Data Stream

The glyphogram data stream is defined as a cluster in Binary SignWriting. It is a series of repeating writing character codes with X,Y coordinates. As a token stream, a cluster can be defined as: cluster = ([wc]nn)*
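The cluster pattern translates directly into a JavaScript regular expression. The function name is hypothetical. The match ignores case because the token stream description says case does not matter.

```javascript
// cluster = ([wc]nn)*  : zero or more symbols, each followed by two numbers.
const CLUSTER_PATTERN = /^(?:[wc]nn)*$/i;

// Sketch: check whether a token string is a well-formed cluster.
function isCluster(tokens) {
  return CLUSTER_PATTERN.test(tokens);
}

console.log(isCluster("wnnwnn")); // true
console.log(isCluster("wnnw"));   // false, the last symbol has no coordinates
```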

Consider the following Binary SignWriting: 00801ce0f8d4f8ec8746f8eaf8fe

Char   Token   Value
0080   B       SIGN_MARKER
1ce0   w
f8d4   n       X = 85
f8ec   n       Y = 109
8746   w
f8ea   n       X = 107
f8fe   n       Y = 127
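The coordinate values listed for this example can be reproduced with a short decoder. The zero point 0xf87f is not stated in this guide; it is an assumption inferred from the listed values (f8d4 for 85, f8ec for 109). Under that assumption the range -1919 through 1919 occupies codes 0xf100 through 0xfffe, just above the symbol codes. The function names are hypothetical.

```javascript
// Assumption inferred from the example values: code 0xf87f represents 0.
const NUMBER_ZERO = 0xf87f;
const NUMBER_LIMIT = 1919; // number characters range from -1919 through 1919

// Sketch: turn a number character code into its coordinate value.
function decodeNumber(code) {
  const value = code - NUMBER_ZERO;
  return Math.abs(value) <= NUMBER_LIMIT ? value : null;
}

// Sketch: read "symbol, X, Y" triples from a hex data stream.
function decodeCluster(hex) {
  const codes = hex.match(/[0-9a-f]{4}/gi) || [];
  const symbols = [];
  // Skip the leading sign marker (0080) when present.
  let i = codes.length > 0 && parseInt(codes[0], 16) === 0x0080 ? 1 : 0;
  for (; i + 2 < codes.length; i += 3) {
    symbols.push({
      code: codes[i].toLowerCase(),
      x: decodeNumber(parseInt(codes[i + 1], 16)),
      y: decodeNumber(parseInt(codes[i + 2], 16)),
    });
  }
  return symbols;
}

console.log(decodeCluster("00801ce0f8d4f8ec8746f8eaf8fe"));
// [ { code: '1ce0', x: 85, y: 109 }, { code: '8746', x: 107, y: 127 } ]
```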

PHP

Basic Example: <img src="glyphogram.php?bsw=0080...">
Colorize: <img src="glyphogram.php?bsw=0080...&colorize=1">
Size: <img src="glyphogram.php?bsw=0080...&size=0.7">
Line Color: <img src="glyphogram.php?bsw=0080...&line=ff0000">
Fill Color: <img src="glyphogram.php?bsw=0080...&fill=ff0000">
Multiple: <img src="glyphogram.php?bsw=0080...&size=0.7&colorize=1">

6. Column

The column.php script is used for writing text in lanes. Unlike the glyph and glyphogram, a column has a defined height.

Consider the following BSW:
Data Stream: 00801ce0f8d4f8ec8746f8eaf8fe008032eaf89ef91432e1f897f8fb5ec0f8b1f901b3b4f8a9f8e5ec20
Token Stream: BwnnwnnBwnnwnnwnnwnnP

PHP

Basic Example: <img src="column.php?bsw=0080...">
Colorize: <img src="column.php?bsw=0080...&colorize=1">
Size: <img src="column.php?bsw=0080...&size=0.7">
Line Color: <img src="column.php?bsw=0080...&line=ff0000">
Fill Color: <img src="column.php?bsw=0080...&fill=ff0000">
Multiple: <img src="column.php?bsw=0080...&size=0.7&colorize=1">

7. SymbolPalette

Organization

The SymbolPalette is organized using a 6 by 16 grid. The top layer of the SymbolPalette displays the SymbolGroups. The middle layer displays the BaseSymbols of the selected SymbolGroup. The bottom layer displays the valid symbols of the selected BaseSymbol.

Function

The SymbolPalette has 2 functions: clicking and dragging. Clicking on a SymbolGroup will refresh the SymbolPalette to display the BaseSymbols for that SymbolGroup. Likewise, clicking on a BaseSymbol will display the symbols for the BaseSymbol.

Any symbol can be dragged from the SymbolPalette. The symbols can be dropped in a SignBox or a SymbolList, described below in the SignMaker section.

8. SignMaker

SignMaker contains 4 main parts: the SymbolPalette, the SignBox, the special commands, and the SignSpelling Sequence. The SymbolPalette is described above.

SignBox

The SignBox is the large empty square. Symbols from the SymbolPalette can be dropped in the SignBox. Selected symbols appear in blue. It is possible to select multiple symbols by clicking and dragging across the SignBox, which will select all symbols between the start and end of the drag. Holding the shift key while clicking will also allow multiple selection of symbols.

Special Commands

The special commands are below the SignBox.

SignSpelling Sequence

The SignSpelling Sequence, hereafter called the sequence, is the top right square. The sequence is a sequential list of symbols used for sorting. Symbols can be dropped from the SymbolPalette or the SignBox. The symbols in the sequence can be rearranged by dragging them around. The symbols can be removed from the sequence by clicking. Blank symbol boxes in the sequence are ignored.

9. Project Files

Data Files

There are 2 data files: iswa/iswa.sgd for SymbolGroups and iswa/iswa.bsd for BaseSymbols. The basic use of these files can be found in iswa.php.

SymbolGroup Data

BaseSymbol Data

Font Files

Valerie Sutton has created over 35k individual PNG images for the ISWA. Each symbol had a PNG file named after the symbol id. These images have been sorted by BaseSymbol character code, reformatted for standard color and reduced file size, and renamed to the character code. The files are in the iswa subdirectory.

License Files

In the root directory, the file COPYING.txt is the GPL v3 file.

In the iswa subdirectory, there are 2 files regarding the Open Font License: OFL.txt and OFL-FAQ.txt.


Release Candidate 1

Release Candidate 1 is expected in July 2009.