Thursday, December 8, 2011
The sub-set of AIML
I've got the list of files down to 18 files containing some 17800 categories. The next step is to go through these categories in much more detail. What am I looking for? Anything that doesn't match the character profiles of Atomic and Romeo. Even if a category suits only one of them, it will still have to go.
Why is less AIML better than more? And why am I making this decision? The need to avoid Atomic and Romeo having too much in common is a part of the logic. Another part is less obvious. It is beyond difficult (if not impossible) to keep any overall image of the possible relationships between categories. Less is better. Another part comes from the scriptwriter in me and is even less obvious. How would I edit a script? Look for the crucial moments, those lines of dialogue the audience simply must get to generate closure. Everything else is up for grabs.
What is the absolute least? I don't know. However, Atomic and Romeo will have their own 'Properties' defined inside the Pandorabots server rather than in an AIML file. This is where I'll set the 'facts' of their back story.
Now back to the categories.
Decision time
I spent yesterday going through the generic set of AIML, the aiml-en-us-foundation-alice.v1-5 set. This was not an easy task; there are 64 files containing over 93000 categories!
For each of the files I skimmed through the AIML categories. I was looking for the relationship between the <pattern> and the <template> tags. This gave me a feel for what the file was used for, its overall purpose, content and themes. This is like a (shoddy) mix of content and discourse analysis. I 'coded' the files by their purpose. Today I will use this as the basis for making the decision on what stays and what goes.
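The skim described above can be partly automated. What follows is only a minimal sketch in Python, not the method I actually used: it assumes the AIML files are well-formed XML without a namespace, and the sample categories and the name summarise_aiml are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for one AIML file; not taken from the ALICE set.
SAMPLE_AIML = """<aiml version="1.0.1">
  <category>
    <pattern>HELLO</pattern>
    <template>Hi there!</template>
  </category>
  <category>
    <pattern>WHAT IS YOUR NAME</pattern>
    <template>My name is Atomic.</template>
  </category>
</aiml>"""


def summarise_aiml(xml_text):
    """Return (category_count, [(pattern, template), ...]) for one AIML document."""
    root = ET.fromstring(xml_text)
    pairs = []
    for category in root.iter("category"):
        pattern = category.find("pattern")
        template = category.find("template")
        # itertext() flattens any nested tags inside the element into plain text
        pattern_text = "".join(pattern.itertext()).strip() if pattern is not None else ""
        template_text = "".join(template.itertext()).strip() if template is not None else ""
        pairs.append((pattern_text, template_text))
    return len(pairs), pairs


count, pairs = summarise_aiml(SAMPLE_AIML)
print(count)
for pattern_text, template_text in pairs:
    print(pattern_text, "->", template_text)
```

Run over each file in turn, a listing like this gives a quick feel for what the file is for, although the 'coding' by purpose still has to be done by eye.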
The core consideration in making the decision about which files stay and which go is character. Does this file or category suit Atomic and Romeo? It has to suit both as this edited set will be common to both characters.
There are some examples of very clever uses of AIML. For example: telling jokes that require the chat-bot to remember the last thing it said and then deliver the punch-line after the client types in WHAT; doing some simple maths; and creating stories using celebrity names randomly chosen from a list that are added to existing story lines.
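The joke trick depends on the chat-bot remembering its own previous line, which AIML expresses with the <that> tag. Here is a rough Python sketch of the idea only. It is not how the ALICE engine is implemented, and the categories, the joke and the names JokeBot and normalise are all invented for illustration.

```python
# Each hypothetical category is (pattern, that, template).
# `that` is the previous bot reply the category requires, or None for "any".
CATEGORIES = [
    ("TELL ME A JOKE", None, "Why did the robot cross the road?"),
    ("WHAT", "Why did the robot cross the road?", "It was programmed by a chicken."),
    ("WHAT", None, "Could you rephrase that?"),
]


def normalise(text):
    """Upper-case the input and drop punctuation, a rough stand-in for AIML normalisation."""
    kept = "".join(ch for ch in text.upper() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


class JokeBot:
    def __init__(self, categories):
        self.categories = categories
        self.last_reply = None  # what the bot said on its previous turn

    def respond(self, client_input):
        wanted = normalise(client_input)
        # Categories constrained by `that` are tried before unconstrained ones
        ordered = sorted(self.categories, key=lambda c: c[1] is None)
        for pattern, that, template in ordered:
            if pattern != wanted:
                continue
            if that is None or that == self.last_reply:
                self.last_reply = template
                return template
        self.last_reply = "I have no answer for that."
        return self.last_reply


bot = JokeBot(CATEGORIES)
print(bot.respond("Tell me a joke."))
print(bot.respond("What?"))
print(bot.respond("What?"))
```

The second WHAT gets the fallback reply because the bot's previous line is no longer the joke's set-up. The punch-line only fires at the right moment.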
The automated story telling is interesting. It is akin to theatre games where the audience, using some random device like dice or a chocolate wheel, sets the location, characters and theme for actors to improvise around. Footlice Theatre Company in Newcastle used to do a parody of Melrose Place called Hellrose Place using these improvisational devices - performing in a pub with no 'safety net'!
When I'm finished creating the edited set I'll post a list of the remaining files.
Tuesday, December 6, 2011
Some elementary issues
This post is a device. The purpose of this device is to force me to clearly articulate some elementary issues in the development of character and AIML. So, please excuse me if the following contains more questions than answers and throws seemingly disconnected ideas around... Such is my wont...
1. The common back story problem
Atomic and Romeo need to share a common back story - they are friends rather than strangers. Cool. Simply give them some common AIML categories... That sounds easy...
2. The too much recursion / dead-end problem
There are on YouTube, and other places, examples of chat-bots talking to each other (or of a chat-bot talking to itself). For instance, look for Fake Kirk talking to ALICE - they use the same ALICE 'engine' and AIML architecture as I am using. These conversations are pretty banal - strewn with non-sequiturs and odd recursions. Is this as good as it can be? If so I've got real problems in this project.
I have a hypothesis. When the two chat-bots share too much AIML this triggers a lot of recursive patterns or leads to conversational dead-ends. As an example, both ALICE and Fake Kirk respond to the "Do you like science fiction?" question in exactly the same way. Neither chat-bot actually knows how to reply to the answer "Yes I love it, especially the works of Philip K. Dick." Both get hung up on the "it" - what does 'it' refer to?
What does this mean in practice? Atomic and Romeo need to share some but not too much AIML. I think this balancing act will be difficult to achieve. Maybe this balance will emerge from the writing and rehearsal process? Depending on emergent phenomena is a risky business.
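The hypothesis can be played with in a toy model. The Python sketch below treats each chat-bot as a simple lookup table from the line it hears to the line it says. This is far cruder than real AIML matching, and every line of dialogue and every name in it is invented. It does not prove the hypothesis. It only shows the shape of the problem.

```python
def converse(bot_a, bot_b, opening, max_turns=20, default="I do not understand."):
    """Alternate replies between two lookup-table bots.

    Returns (transcript, repeated), where repeated is True if the
    conversation stopped because a line was said a second time.
    """
    transcript = [opening]
    seen = {opening}
    speakers = [bot_b, bot_a]  # bot_a spoke the opening, so bot_b answers first
    line = opening
    for turn in range(max_turns):
        bot = speakers[turn % 2]
        line = bot.get(line, default)
        transcript.append(line)
        if line in seen:
            return transcript, True
        seen.add(line)
    return transcript, False


OPENING = "Do you like science fiction?"

# Hypothetical shared categories: both bots know exactly the same things.
SHARED = {
    "Do you like science fiction?": "Yes. What about you?",
    "Yes. What about you?": "Do you like science fiction?",
}

# Hypothetical individual additions layered over the shared set.
atomic = {**SHARED, "Only the works of Philip K. Dick.": "Which of his novels is your favourite?"}
romeo = {
    **SHARED,
    "Do you like science fiction?": "Only the works of Philip K. Dick.",
    "Which of his novels is your favourite?": "Ubik.",
}

same_transcript, same_repeated = converse(SHARED, SHARED, OPENING)
mixed_transcript, mixed_repeated = converse(atomic, romeo, OPENING, max_turns=3)
print(same_transcript, same_repeated)
print(mixed_transcript, mixed_repeated)
```

With identical tables the conversation circles back to the opening question after two replies. With a few individual categories it moves somewhere new, at least until one of the bots runs out of things it knows and falls back on the default reply.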
3. The start point problem
Basically I've got a choice of two starting points.
Option One - I could start with a completely blank slate. Atomic and Romeo as they currently exist know nothing - even less than Sergeant Schultz! I could write a completely unique set for each of them.
Option Two - I could start with a generic set of AIML, for example the aiml-en-us-foundation-alice.v1-5.zip set. This is "a free AIML set, in the English Language as spoken in the United States, authored by the AI Foundation".
The advantage of Option Two is that it maintains some serendipity in the interactions that could take me places that I would not think of going. Rehearsing with actors who don't even know there is a script would be tedious in the extreme - I would have to craft every single <pattern> and <template>. I want these guys to have some independence from me as the writer. Also, it takes care of the simple, but completely necessary, basics of conversation - greeting and the like. The downside is the editing. There are a lot of categories that relate to US politics and other issues as well as US spelling and idioms.
Some conclusions - so far...
The plan is to go with the V1.5 version of Alice.
Edit the set ruthlessly - strip it right back.
Generate a common set that I can use for Atomic and Romeo.
Then, in rehearsal, add to their individual sets.
Hopefully this will allow the balance between what they share in common and what is unique to each to emerge.
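One way to keep an eye on that balance is to treat each character as the common set with an individual set layered on top, and then measure how much the two characters still answer identically. This is a sketch of the idea in Python rather than anything Pandorabots provides. The function names and the sample categories are made up.

```python
def build_bot(common, individual):
    """Layer a character's own categories over the common set; own entries win."""
    return {**common, **individual}


def shared_fraction(bot_a, bot_b):
    """Fraction of all known patterns to which both bots give the identical reply."""
    all_patterns = set(bot_a) | set(bot_b)
    if not all_patterns:
        return 0.0
    same = {p for p in set(bot_a) & set(bot_b) if bot_a[p] == bot_b[p]}
    return len(same) / len(all_patterns)


# Hypothetical categories, keyed by pattern, for illustration only.
COMMON = {"HELLO": "Hi there!", "GOODBYE": "See you later."}
ATOMIC_OWN = {"WHAT IS YOUR NAME": "My name is Atomic."}
ROMEO_OWN = {
    "WHAT IS YOUR NAME": "My name is Romeo.",
    "DO YOU LIKE SCIENCE FICTION": "Only the works of Philip K. Dick.",
}

atomic_bot = build_bot(COMMON, ATOMIC_OWN)
romeo_bot = build_bot(COMMON, ROMEO_OWN)
print(shared_fraction(atomic_bot, romeo_bot))
```

A figure like this says nothing about the quality of the conversation, but tracking it through rehearsal would at least show whether the characters are drifting apart or collapsing into each other.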
Been away a while
The arrow of time is really pointy - you know what I mean. Anyway over the past months I've been concentrating on writing sections of my exegesis and, where possible, converting these to journal articles. I've now got a pretty good handle on the overall structure of the exegesis. The sections I've already written are long but this is a good starting point for editing and shaping.
In early January I'm off to Los Angeles for a couple of conferences - the Eighth International Conference on Technology, Knowledge and Society and then the Sixth International Conference on Design Principles and Practices. The papers for these have been written and submitted. Hopefully the publishing process will not cause any grief.
In February I'm delivering a paper at the 18th Australasian Humour Studies Network Colloquium in Canberra. Last year it was held in Hobart. A damn fine event that attracted academics and practitioners - this year promises to be bigger and better.
The next couple of posts will be about AIML development issues. This is my core task now.
Monday, July 18, 2011
A better diagrammatic view
Here is an updated version of the diagram. I was concerned by the Ptolemaic positioning of the playwright at the center of the production universe.
The structures remain unchanged but this diagram suggests that these structures are internalized within the playwright. I think this better describes the process and accords with the work of Bourdieu (habitus and field of works) and Csikszentmihalyi (domain acquisition and field).
(Note - This is version 5 of the diagram. As I've been writing the associated section of the literature review I've refined the titles used on the sections.)
Thursday, July 14, 2011
People finding a paper of mine
I have an account on Academia.edu - the professional networking site for academics (http://newcastle-au.academia.edu/MichaelMeany). Recently, over the past month or so, I've been getting a number of notifications saying that people from all over the world have been finding my work - usually through Google searches.
In particular, the paper that has been attracting the most interest is one I delivered at a conference in 2007 titled 'Humour, Anxiety, and Csikszentmihalyi's concept of Flow'. ( http://www.inter-disciplinary.net/ati/education/cp/ce3/meany%20paper.pdf )
I'm certainly not saying this is the academic must-read of the season; however, it does suggest that the title of this paper, if nothing else, has struck a chord in the zeitgeist. The combination of humour theory and creativity theory looks to be a rich field.
A diagrammatic view of the project

Last night I had an epiphany... It was one of those rare moments of clarity. I was looking for a way to describe the relationships between the playwright, the scripted dialogue and the interface (with all the attendant technology).
A playwright's job is far more than simply writing the words to be delivered. As the suffix 'wright' suggests, the job requires an understanding of the entire production process, just as a shipwright needs to be more than a woodworker or welder to build a ship. The playwright needs to grapple with the technology and industry of production - what is possible on a live theatre stage and how it can be achieved. For example, there is no point attempting to stage a film script that requires truckloads of CGI (imagine trying to stage Avatar).
The diagram here represents the structural layers of the project - each of which offers a unique level of agency. The further from the centre, the less direct influence the playwright has. This is not to say that at the outer edge the playwright is completely powerless - at this point the playwright is still making choices but they are largely constrained by the structure. In this project I will be depending upon the 'kindness of strangers' - the good people who develop and distribute web browsers. Without them the project would not have a 'theatre'. In production, a play needs to be adjusted to fit the stage and these choices are examples of the agency of the director and playwright. They are choices made within a constrained structure.
Skipping to the inner circles, the structure nearest to the playwright is the AIML of the two characters - Atomic Playboy and Radiation Romeo. The situation here is almost the reverse of the outer shell. The structure of AIML, based on XML, is so open that the characters can be scripted to say anything. The choices here, for the playwright, are those that any playwright would face - what to leave in and what to leave out.
The playwright at the core of the project is not an 'atomic' unit, not indivisible. Rather, the playwright is a construction of a whole other set of contributing elements - habitus, domain acquisition, writing experience, personal preference, genetics etc. These things I'll be dealing with in other sections of the literature review.