G.A.T.E.
(Graphics Accessible To Everyone)
http://lsd.fi.muni.cz/gate
© Radek Ošlejšek
Faculty of Informatics MU
oslejsek@fi.muni.cz
 Even visually impaired people can paint – http://pavla.wu.cz/
 Project goals:
– graphics accessible by means of a dialogue (dialogue is the basic concept
for manipulating images)
– to develop utilities for picture annotation
– to provide blind users with support for investigating pictures
– to develop a system for image generation, enabling the blind to easily
create some limited form of computer graphics
Motivation
Navigation
 Recursive Navigation Grid
– Navigation backbone
– Recursive division of the picture space into nine identical sectors, analogous
to the layout of a numeric keypad
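A minimal sketch of how a sequence of keypad digits could be resolved to a sector of the picture. The function name `sector_bounds`, the top-left coordinate origin, and the assumption that 7-8-9 form the top row (as on a numeric keypad) are illustrative choices, not taken from the slides.

```python
def sector_bounds(path, x=0.0, y=0.0, width=1.0, height=1.0):
    """Return (x, y, width, height) of the sector reached by a digit path.

    Each digit 1-9 selects one of nine equal sub-sectors of the current
    sector. Assumed layout (numeric keypad): 7 8 9 on top, 1 2 3 at the
    bottom. The origin is the top-left corner and y grows downward.
    """
    for digit in path:
        if digit not in "123456789":
            raise ValueError(f"invalid sector digit: {digit!r}")
        index = int(digit) - 1
        col = index % 3           # 0 = left, 1 = middle, 2 = right
        row = 2 - index // 3      # 0 = top, 1 = middle, 2 = bottom
        width /= 3
        height /= 3
        x += col * width
        y += row * height
    return (x, y, width, height)


# Sector 9 of a 900x900 picture, then its central sub-sector (path "95").
print(sector_bounds("9", 0, 0, 900, 900))
print(sector_bounds("95", 0, 0, 900, 900))
```

Each additional digit narrows the region by a factor of three in both directions, so a short path is enough to address small details.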
Semantics
Annotation concepts
 SVG (Scalable Vector Graphics) as a basic graphical format
– defines scene graph, i.e. structural decomposition of graphics
– based on XML – important feature for the integration of annotation
– supports vector graphics as well as raster images
– supports scripting and event handling
 Ontology
– defines structure and semantics of the world
– OWL (Web Ontology Language), XML-based as well
– formalization suitable for machine-learning and processing
– can simplify the process of annotation
• e.g. by offering expected details (sub-objects)
– can deduce consequences
• e.g. swan + lake = swimming swan
– can check correctness of a picture
• required mainly for the automatic image creation, i.e. painting
Implementation
Ontology-based annotation
 Traditional annotation appends a text description to graphical elements
 Ontology-based annotation formally defines the semantics of objects by an
ontology and then classifies graphical elements in that ontology.
 Mathematical formalism
 Structuring the knowledge
 Preventing chaos in terminology
 Multilinguality
 Generic OWL ontology does not restrict abstraction
Fig. Abstraction [Booch, G., et al.: Object-Oriented Analysis and Design with Applications. Addison-Wesley Professional (2007)]
Fig. Ontology-based annotation
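The idea of classifying graphical elements in an ontology can be sketched roughly as follows. Everything here is a simplifying assumption: the annotation namespace `http://example.org/annotation`, the `concept` attribute, and the toy dictionary standing in for an OWL ontology are all hypothetical and only illustrate how an ontology could offer expected sub-objects for an annotated element.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ANNO_NS = "http://example.org/annotation"  # hypothetical namespace

# Tiny SVG scene graph whose elements point to ontology concepts.
SVG_SOURCE = f"""
<svg xmlns="{SVG_NS}" xmlns:a="{ANNO_NS}">
  <g id="C1" a:concept="Lake">
    <path id="C2" a:concept="Swan" d="M0 0"/>
  </g>
</svg>
"""

# Toy stand-in for an ontology: concept -> (parent, expected sub-objects).
ONTOLOGY = {
    "Thing": (None, []),
    "Animal": ("Thing", ["head", "body"]),
    "Bird": ("Animal", ["wings", "beak"]),
    "Swan": ("Bird", ["long neck"]),
    "Lake": ("Thing", ["surface", "shore"]),
}


def annotated_elements(svg_text):
    """Map element ids to the ontology concept they are classified as."""
    root = ET.fromstring(svg_text)
    key = f"{{{ANNO_NS}}}concept"
    return {el.get("id"): el.get(key) for el in root.iter() if el.get(key)}


def expected_parts(concept):
    """Collect sub-objects expected for a concept, inherited ones first."""
    parts = []
    while concept is not None:
        parent, own = ONTOLOGY[concept]
        parts = own + parts
        concept = parent
    return parts


for element_id, concept in annotated_elements(SVG_SOURCE).items():
    print(element_id, concept, expected_parts(concept))
```

Because the graphical element only stores a reference to a concept, the textual details (and their translations) can live in the ontology rather than in each picture.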
Graphical ontology
 Ontology designed from the perspective of graphical objects,
their structure and visual properties.
– E.g. medicinal and poisonous plants are both meaningful ontology
categories. But are we able to define their visual characteristics,
distinguishing them?
 Predefined visual attributes are similar to those used in energy
distribution in 3D scenes, but transformed to approximate verbal
form
– E.g. viewDirection = {front_view, rear_view, left_view, ...}
Graphical ontology (cont.)
 OWL ontology prescribing important global visual characteristics and
constraints valuable for picture investigation by means of dialogues
 Visual aspects are organized in several classification types:
– Picture classification
• Properties related to the whole picture
• E.g. genre (photo, sketch, ...), central focus (dominant objects)
– Topological classification
• Compactness and topology of objects
• Compact objects = significant shape (human being, ...), non-compact
objects (sky, smoke, …), clusters (forest, crowd, ...)
– Geometric classification
• E.g. unusual size (big, narrow, ...), significant shape (oval, angular), etc.
– Color classification
• Properties related to illumination and colors.
• E.g. dominant color (ravens are black), light direction (dark silhouettes
due to a back light)
Annotation of raster images (photos)
A: an open fireplace with a fire,
B: seashore,
C: Genoa Lake,
D: Alps,
E: cloudy sky,
A1: close vicinity of the fire, A2: the fire,
B1: sand shore, B2: seashore vegetation,
C1: the surface of the lake, C2 and C3: a swan on the lake,
E1: dark clouds, E2: light shining from behind the clouds.
Dialogue-based picture investigation
What-where Language
 What-Where Language, WWL
– Sentences of the form “WHAT is WHERE” or “WHERE is WHAT”
– Enables the user to ask simple questions about the objects and their position
 Object position
– Recursive Navigation Grid
 WWL example:
U: What is on the left?
S: On the left there is a small elephant.
U: Where is the elephant?
S: There are two elephants. One is on the left,
another on the right.
U: What is in the foreground?
S: In the foreground there is a big gray stone.
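A rough sketch of how the two WWL question forms could be answered from annotated objects. The scene data, the `answer` function, and the exact wording of the replies are assumptions made for illustration; the real system presumably derives positions from the annotated picture rather than from fixed labels.

```python
import re

# Hypothetical annotated scene: each object has a name and a position label.
SCENE = [
    {"name": "elephant", "where": "on the left"},
    {"name": "elephant", "where": "on the right"},
    {"name": "stone", "where": "in the foreground"},
]


def answer(question, scene=SCENE):
    """Answer a 'What is WHERE?' or 'Where is WHAT?' question (singular only)."""
    q = question.strip().rstrip("?").lower()

    match = re.fullmatch(r"what is (.+)", q)
    if match:
        where = match.group(1)
        names = [obj["name"] for obj in scene if obj["where"] == where]
        if not names:
            return f"There is nothing {where}."
        return f"{where.capitalize()} there is: {', '.join(names)}."

    match = re.fullmatch(r"where (?:is|are) (?:the )?(.+)", q)
    if match:
        name = match.group(1)
        places = [obj["where"] for obj in scene if obj["name"] == name]
        if not places:
            return f"I do not know any {name}."
        if len(places) == 1:
            return f"The {name} is {places[0]}."
        return f"There are {len(places)} objects named {name}: {'; '.join(places)}."

    return "Sorry, I only understand WHAT-is-WHERE and WHERE-is-WHAT questions."


print(answer("What is on the left?"))
print(answer("Where is the elephant?"))
```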
Dialogue-based investigation - example
S: This is a color photo of Genoa Lake from its shore. For more information, you can choose
Description, Technical information, use keywords or What-Where language.
U: Description
S: In the foreground, there is an open fireplace, shore and a small part of the lake. In the central
part, there is the shore near the left-hand part of the picture and the lake is approximately
in the right-hand part of the picture. In the background, there is a part of the lake, Alps and
a cloudy sky.
/Comment: The approximate depth regions and the first semantic level are taken into consideration./
U: What is in the lake?
/Comment: WWL question, the second semantic level is taken into consideration./
S: There are two swans in the lake.
U: Where are the swans?
/Comment: Another WWL question./
S: Do you want an approximate localization, or to use the
Recursive Navigation Grid?
U: Approximate.
S: One is near the center of the picture, the second one
is horizontally near the right-hand side, vertically near
the middle.
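The approximate localization in the dialogue above could be produced along these lines. The split of each axis into thirds and the function names `band` and `approximate_location` are assumptions; the slides do not state which thresholds the system uses.

```python
def band(value, low, mid, high):
    """Pick a verbal label for a normalized coordinate in the range 0..1."""
    if value < 1 / 3:
        return low
    if value <= 2 / 3:
        return mid
    return high


def approximate_location(x, y):
    """Describe a normalized point (origin top-left) in approximate words."""
    horizontal = band(x, "the left-hand side", "the middle", "the right-hand side")
    vertical = band(y, "the top", "the middle", "the bottom")
    if horizontal == vertical == "the middle":
        return "near the center of the picture"
    return f"horizontally near {horizontal}, vertically near {vertical}"


print(approximate_location(0.5, 0.5))
print(approximate_location(0.9, 0.5))
```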
Visitor GoF Pattern
Visitor GoF Pattern (cont.)
Implementation
 Bachelor's thesis topic idea: create a set of visitors. Suitable for 1-2 students.
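A minimal sketch of the GoF Visitor pattern applied to a scene graph. It assumes that the visitors traverse annotated picture elements; the classes `Group`, `Shape`, `LabelCollector`, and `ShapeCounter` are hypothetical and are not taken from the GATE code base.

```python
class Node:
    """Element of a hypothetical annotated scene graph."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

    def accept(self, visitor):
        raise NotImplementedError


class Group(Node):
    def accept(self, visitor):
        return visitor.visit_group(self)


class Shape(Node):
    def accept(self, visitor):
        return visitor.visit_shape(self)


class LabelCollector:
    """Visitor that lists labels in depth-first order."""

    def __init__(self):
        self.labels = []

    def visit_group(self, group):
        self.labels.append(group.label)
        for child in group.children:
            child.accept(self)

    def visit_shape(self, shape):
        self.labels.append(shape.label)


class ShapeCounter:
    """Visitor that counts leaf shapes and ignores group labels."""

    def __init__(self):
        self.count = 0

    def visit_group(self, group):
        for child in group.children:
            child.accept(self)

    def visit_shape(self, shape):
        self.count += 1


scene = Group("lake", [Shape("swan"), Shape("swan"), Group("shore", [Shape("sand")])])
collector, counter = LabelCollector(), ShapeCounter()
scene.accept(collector)
scene.accept(counter)
print(collector.labels, counter.count)
```

New kinds of processing (description, statistics, export) can then be added as further visitor classes without changing the node classes.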
Sonification
 Color is represented as a triple of primary colors
– e.g. red, green and blue (RGB)
 Sonification transforms a color into a combination of sounds
– each of the three primary colors is assigned a sound
– the intensity of a color component in the sonified color drives the loudness of the produced
sound
– information about the intensity of the three components usually does not help
the user to imagine the color
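A small sketch of component-based sonification, in which each RGB component drives the loudness of its own tone. The tone frequencies in `PRIMARY_TONES_HZ`, the use of sine waves, and the function names are assumptions made for illustration only.

```python
import math

# Hypothetical tones assigned to the three primary colors (in Hz).
PRIMARY_TONES_HZ = {"red": 262.0, "green": 330.0, "blue": 392.0}


def component_loudness(rgb):
    """Map 8-bit RGB components to loudness values in the range 0..1."""
    for component in rgb:
        if not 0 <= component <= 255:
            raise ValueError("RGB components must be in 0..255")
    red, green, blue = rgb
    return {"red": red / 255, "green": green / 255, "blue": blue / 255}


def sonify(rgb, num_samples=80, sample_rate=8000):
    """Mix three sine tones whose amplitudes follow the color components."""
    loudness = component_loudness(rgb)
    samples = []
    for i in range(num_samples):
        t = i / sample_rate
        value = sum(
            loudness[name] * math.sin(2 * math.pi * freq * t)
            for name, freq in PRIMARY_TONES_HZ.items()
        )
        samples.append(value / 3)  # keep the mix within -1..1
    return samples


print(component_loudness((255, 128, 0)))
print(len(sonify((255, 128, 0))))
```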
Fig. RGB color model, HSL color model, HSV color model
Semantic color model
 Semantic Color Model (SCM) is an approximate model which
attempts to express a color as a combination of two well-known
colors.
– it is easier to imagine colors described in terms like: yellow-green, gray with an orange
shade, etc.
– it is easier to distinguish two sounds coding two well-known colors
Fig. Semantic color model (labels: U, V1, V2, d1, d2, x1, x2)
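The slides do not define how SCM picks and weights the two well-known colors, so the following is only one possible sketch. It assumes a small hypothetical palette `NAMED_COLORS`, Euclidean distance in RGB space, and weights inversely proportional to the distances; the actual model may work differently.

```python
import math

# Hypothetical palette of well-known colors (8-bit RGB).
NAMED_COLORS = {
    "red": (255, 0, 0),
    "orange": (255, 165, 0),
    "yellow": (255, 255, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
    "gray": (128, 128, 128),
    "white": (255, 255, 255),
    "black": (0, 0, 0),
}


def describe(rgb):
    """Express a color as a weighted pair of its two nearest named colors."""
    ranked = sorted(NAMED_COLORS, key=lambda name: math.dist(rgb, NAMED_COLORS[name]))
    first, second = ranked[:2]
    d1 = math.dist(rgb, NAMED_COLORS[first])
    d2 = math.dist(rgb, NAMED_COLORS[second])
    if d1 == 0:
        return [(first, 1.0)]  # exact match with a named color
    total = d1 + d2
    # The closer color gets the larger weight.
    return [(first, d2 / total), (second, d1 / total)]


print(describe((128, 255, 0)))
print(describe((255, 0, 0)))
```

The two weights could then drive the loudness of the two sounds that code the well-known colors.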
Implementation
 PHP + Flash on the client side
 Master's thesis topic idea: rewrite in Java, move to the server side (define an
interface), usability testing (latency), optimization and modification of
SCM, ...
Painting by means of dialogue
Result depends on experience
Paint Mona Lisa
??
Result depends on experience
Paint Mona Lisa
What is Mona Lisa?
Mona Lisa is a smiling woman
Knowledge base
Painting example
U: Put a comet in the sector 9.
U: Put a snowman into the bottom left corner.
U: Write the text „Merry Christmas and Happy New Year“ into the horizontal
center, color yellow.
U: Write the text „PF 2010“ into the bottom right corner, color blue.
U: Set background to snowflakes.
U: Generate.
Fig. The Christmas card generated by a blind user
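A rough sketch of how "Put ..." commands like those above could be turned into placement instructions. The regular expressions, the `parse_put` function, and the mapping of corners to keypad sectors in `CORNERS` are assumptions for illustration; the grammar of the real dialogue system is not given in the slides.

```python
import re

# Assumed mapping of corner names to sectors of a keypad-like grid.
CORNERS = {"bottom left": "1", "bottom right": "3", "top left": "7", "top right": "9"}


def parse_put(command):
    """Parse 'Put a X in(to) the sector N' or 'Put a X in(to) the Y corner'."""
    text = command.strip().rstrip(".")

    match = re.fullmatch(
        r"put an? (\w+) (?:in|into) (?:the )?sector ([1-9])", text, re.IGNORECASE
    )
    if match:
        return {"object": match.group(1).lower(), "sector": match.group(2)}

    match = re.fullmatch(
        r"put an? (\w+) (?:in|into) the (\w+ \w+) corner", text, re.IGNORECASE
    )
    if match and match.group(2).lower() in CORNERS:
        return {"object": match.group(1).lower(), "sector": CORNERS[match.group(2).lower()]}

    return None  # not a placement command this sketch understands


print(parse_put("Put a comet in the sector 9."))
print(parse_put("Put a snowman into the bottom left corner."))
```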
Implementation
 Master's/bachelor's thesis topic ideas:
 Web services for the knowledge base (moving the image DB to the server side, structured records, …)
 Connecting the generator with WWL
 VIM module
 ...
Thank you for your
attention!