- Mission Statement
- Project Feedback
Diagram SciFi Karaoke is an installation art project focusing on the human utterance. The karaoke installation consists of a "smart" microphone and a video projector that projects texts on the walls. The microphone should invite the audience to participate in the performance, reacting to the projected materials or to other participants in whatever way they feel inspired to. The microphone is connected to hardware and software that take the participants' voice signal as input, process it in various ways, and return the processed signal to the environment so that it becomes part of the performance. The audience will start to explore the response of the system, and the (sometimes unexpected) system response is supposed to trigger communication between the attendees. The installation is planned to take place at the end of 2004.
Project and Objectives
The central objective of the project is to explore ways in which a system can react to properties of the human voice, most notably the emotional aspects of the voice and the (emotional and semantic) content of utterances triggered by the context, and through its reactions enable forms of communication. Examples: With respect to the emotional aspects, the system might extract acoustic correlates of emotion (pitch, volume, speaking rate), manipulate the speech signal in particular ways, and synthesize utterances that react to the input utterances. With respect to the emotional and semantic content, the system might listen for particular words and generate relevant messages that act as triggers for further interaction.
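As an illustration of what extracting acoustic correlates could look like, here is a minimal sketch in Python. It is not part of the project description: the function names `rms_volume` and `estimate_pitch`, the 75 to 400 Hz search range, and the use of a plain autocorrelation peak are all assumptions chosen for brevity. Speaking rate is left out because it would need some form of syllable or onset detection.

```python
import math

def rms_volume(frame):
    """Root-mean-square amplitude of a frame: a rough correlate of loudness."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def estimate_pitch(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate the fundamental frequency in Hz from the autocorrelation peak.

    fmin and fmax are assumed bounds for a speaking voice (hypothetical defaults).
    Returns None when no positive autocorrelation peak is found (e.g. silence).
    """
    min_lag = int(sample_rate / fmax)   # shortest period considered
    max_lag = int(sample_rate / fmin)   # longest period considered
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame with itself at this lag.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None

if __name__ == "__main__":
    # Synthetic test signal: a 200 Hz tone sampled at 8 kHz, standing in for a voice.
    sample_rate = 8000
    frame = [0.5 * math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(1024)]
    print("volume (RMS):", round(rms_volume(frame), 3))
    print("pitch (Hz):", estimate_pitch(frame, sample_rate))
```

Real speech is far noisier than a pure tone, so a working system would more likely rely on an established pitch tracker and smooth the estimates over several frames before mapping them to an emotional interpretation.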
Learning objectives concerning the main Competencies (additional Competencies, Social and Cultural Awareness and Market Orientation, may be addressed where applicable):
1. Emphasis in P0302 will be on Idea and Concept Generation, more specifically on Brainstorming: students will explore several brainstorming techniques and learn to make a motivated choice of techniques given a particular purpose.
2. With respect to Competency 3 (User Focus and Perspective), students will get an overview of methods for the identification of Requirements and experience how the Requirements constrain the concept generation stage. In addition, the research and design process makes the students familiar with particular aspects of communication.
3. With respect to Competency 2 (Integrating Technology), students will get insight into properties of human speech and of the human voice in relation to emotion, and apply signal processing techniques relating to speech (automatic speech recognition) and the human voice (signal manipulation techniques).
4. With respect to Competency 6 (Visual Language), students will learn to propose motivated solutions fitting the requirements: choosing expressive elements of a multimedia performance that implement the central objectives of the performance.
- Orientation stage: what the project is about. Outcome: initial scenarios
- Analysis/Research stage: Investigating basic aspects of communication and emotion; Identifying requirements; Exploring technology: getting acquainted with relevant aspects of technologies relating to human speech and voice
- Design stage: Developing concepts for the interaction scenarios; Identifying key technologies (automatic speech recognition; speech manipulation techniques; software architecture); Refining the initial concepts
- Realisation stage: Building demonstrations of the concepts that have been selected for the scenarios
- Delivery stage: Providing a business model, identifying appropriate applications
- Working plan including definition of team roles and time schedule (Due end of first week)
- Presentation containing initial scenarios, concepts fitting the scenario and enabling technologies (Due: interim client meeting)
- Final demonstrator with backing information (embedding framework, motivation for choices) (Due: final client meeting)
- A0 poster summarizing the project, printed A4 report documenting the project, CD containing the presentations, final report and A0 poster (Due: end of period)
Ingeborg Houwen has a Master's degree in Theaterwetenschap (Theatre Studies) from the University of Amsterdam, with minors in Taalfilosofie, Filosofische Antropologie, Esthetica and Literatuurwetenschap. She is an author, cultural entrepreneur and performer with a strong interest in new media, and the initiator of Diagram (see http://www.iice.nl/diagram-site/karaoke.html). She organized cultural exchanges and was one of the organizers of Cultural Center UI in 1997. Houwen has written texts for the theatre, has published in journals (including a serial in Folia Civitates) and is preparing a second novel. Contact: firstname.lastname@example.org
List of Available Resources/Experts
- R. Cowie, E. Douglas-Cowie et al.: "Emotion Recognition in Human-Computer Interaction", IEEE Signal Processing Magazine, Vol. 18, January 2001, pp. 32-80 (available through URL http://ieeexplore.ieee.org/xpl/tocresult.jsp?isNumber=19669)
- N. Suzuki, Y. Takeuchi and M. Okada: "Psychological Effects Derived from Mimicry Voice Using Inarticulate Sounds", in R. Mizoguchi and J. Slaney (Eds.): PRICAI 2000, Lecture Notes in Artificial Intelligence 1886, pp. 647-656, Springer (2000)
Reference to similar projects: http://www.mis.atr.co.jp/~christa/NEWWORKS/Artworks.html
- Richard Appleby (TUE/Richard Appleby Design): design
- Jacques Terken (TUE): Communication/Technology/Human Computer Interaction
- Christoph Bartneck/Kees van Overbeeke (TUE): Emotional Design
- Panos Markopoulos: Requirements engineering
Feedback on Project Teamwork
It seems that the team worked together very well and that the members were rather happy with each other. I say “seems” because the task division was sometimes not clear to me, although it may have been clear within the team. This points to an issue that ran through the project: the team did not maintain active communication with coaches and clients, although I kept pushing the team to do so.
The team worked very hard towards a working prototype, especially in the second half of the project, but the workload was not balanced. One explanation could be that some of the team members were ill during the last two weeks.
Feedback on Design Process
The project went through idea generation, concept refinement and prototyping, but throughout the project a careful plan was missing and the goal of the project kept shifting.
In the first half of the project, the team developed ideas and concepts by brainstorming and categorizing, and also did a lot of research on emotions, facial expressions and speech technologies. In the second half of the project, the focus was on how to use visual output to convey emotional information. But there seemed to be a disconnect between the first half and the second half of the project: the research results and concepts developed earlier did not contribute much to the final prototype.
Feedback on Project Objectives
In the second half of the project, the team focused very much on how to give immersive video feedback to the speech input, and gradually shifted away from the original objective of the project: "to do something with speech recognition and emotions". Apparently lacking a good understanding of the project objective, the team did not develop an integrated and coherent vision of the different aspects of the situation they designed. Integrating speech technologies with the well-demonstrated virtual reality could have made this project very compelling.
Feedback on Project Deliverables
Although the final presentation was not very satisfactory, the virtual reality demonstration with surrounding video and 3D graphics output was impressive. A working model like this takes a lot of effort and time. The fact that the hardware was not available on site in time may also be a mitigating factor.
Feedback on Presentations (interim and/or final)
The team presented six concepts: 1. "Emotion is something you leave behind!" 2. "Emotions are to be collected!" 3. Characters 4. "Emotion is a medium of exchange!" 5. "Emotion is a choice!" 6. "Freeform-effect" controlled by hand in water. The clients were impressed by some of the innovative concepts, for example “emotions are to be collected” and “the freeform effect”.
The final concept should be a combination of the above, addressing the use of the speech technology.
The demonstration was impressive, in the sense that it was more powerful than just a presentation, but it also made clear the weaknesses in the project: the role of the speech technology was not clearly stressed.
The team was very busy getting the demonstration to work, so they did not have enough time to prepare the final presentation. The presentation was rather flat and did not get the audience really involved.
The final presentation could not give the audience a concrete idea of how the demonstrated pieces could work together in terms of the project objectives. A use case or a scenario could have helped a lot.
The strong point of this project is the working prototype and the dedication towards such a prototype. The weak point is that the team put too much effort into and focus on the visual output (the virtual reality) rather than integrating the speech technology. A good plan and a clear goal could have helped a lot to balance the effort and could have made this project very successful.
The team worked very hard and often surprised the client and the coaches; however, once again, there should not have been such surprises if the team had actively communicated about the project process with the clients and the coaches.