Saturday 27 September 2014

Hidden human experience in a Turing test: Two vets from London meet the Turing machines

Guest Post by 'hidden humans' in Session 6 Turing test experiment, at The Royal Society, London on Saturday 7 June 2014.

On a warm sunny afternoon in June, two vets from London (a husband and wife team) quizzically entered The Royal Society Building on Carlton House Terrace to take part in the Turing test.  Whilst scaling the flight of stone steps from the Mall, surmounted by the statue of Frederick Augustus, "the grand old Duke of York", we chatted about how many Fellows of The Royal Society we could identify and then concluded we didn’t really know much about the history and maybe Sir Isaac Newton was a fellow? Sadly we only discovered that Alan Turing was himself a Fellow after leaving the building!

We were honoured to be invited by a friend who was part of the organisational team to participate in a Turing Test on 7th June 2014, a particularly poignant date as it was the 60th anniversary of Turing's death, nearly six months after he was given a posthumous royal pardon. We had heard much about artificial intelligence and the work by Professor Kevin Warwick, a Visiting Professor at the University of Reading and Deputy Vice-Chancellor for Research at Coventry University, and his team, and wondered whether the people involved in this field would look particularly cyber-like, or whether Professor Warwick would show us his implanted microchip. We had also heard that actor Robert Llewellyn, who played robot Kryten in the sci-fi comedy TV series Red Dwarf, was one of the judges so, being fans, we hoped to glimpse him and wanted to see him in costume. (He wasn’t.)

The instructions on arrival for the test were cloak-and-dagger-like and we were under strict instructions not to reveal to other people in the building that we were hidden humans – but this was overcome at the registration desk when a member of the team treated us like VIPs. We feared our cover had been blown as he ushered us into a waiting lounge where some other people (we presumed members of the general public) were watching some of the live Turing test conversations on a large screen. There were numerous stands from the universities which were taking part in the project, so we read with interest about the AI projects being studied by both undergraduates and postgraduates and were pleased to see that robots were being developed within the medical field to help humans, rather than take over the planet. At our allotted session time we were ushered into the Test room and designated a PC each. There were around 6 PC terminals in a very grand room; these modern ICT devices did appear slightly out of place with the historic nature of the building and the very grand (and dark) oil paintings of famous fellows which adorned the walls. The physicist and mechanical engineer, James Watt, was peering down at us as we typed our answers....

After meeting several of the other hidden humans (who included a jazz musician) and Professor Warwick (who also appeared human and not cyborg-like) we were invited to take refreshments either before or between test sessions. Being aware of the importance of hydration status and keeping glucose levels up to maximise our own concentration and performance, it seemed rude not to partake of the feast of sandwiches and tasty cakes, scones, clotted cream and tea/coffee!

Despite being sent instructions and reference material before our visit about what the Turing Test consisted of (and also following some blogs online), we hadn’t truly grasped what our role would be. In our minds, “We hidden humans would be talking to computers and someone else (a judge) would be trying to decipher the conversation between the two of us and would then decide who was the AI and who was the Hidden Human.” So please bear this in mind when interpreting our answers: we honestly thought we were communicating with robots and got quite irritated at times at some of their silly questions, slowness in replying to typed answers (especially considering we thought they had gigantic IQs) and deliberate spelling mistakes!

The question sessions took on several themes, quite a few of them focusing on being a student from Russia and what we could recommend to do as tourists in London. Some questions referred to holiday travels, so reminded us of trips to the hairdressers! Many of the questions were plain boring and lost our interest pretty rapidly, so we turned the interview process around and started to question the “thing” (who was actually a judge) that was communicating with us. That could turn out to be entertaining and irritating at the same time.

More interesting questions included:
·         Should all drivers over the age of 70 be banned from driving?
·         Ethical debate about treatment options for genetic conditions?

Irritating questions included those with only a Yes / No answer, those relating to Boris (not sure if this was the mayor of London or the visiting Russian student), persistence in trying to obtain our IDs (which we were instructed not to give away), very childlike spelling and lack of comprehension of words or questions we typed back. The “thing” did not appear to understand irony or sarcasm and would frequently report back that “they did not understand”. However it was equally frustrating when they responded with typographical errors which we could understand, but were left in a quandary as to whether an immediate response would indicate that we were in fact hidden humans?

The experience did help us evaluate ourselves as to “what makes us human?” Is it the fact we have emotions and the ability to respond to ethical dilemmas? What constitutes cognitive processes and ability to reason? Humans and robots can both make errors but do both have the ability to attempt to deceive each other? Time flew by and we were sad to find the Turing Test session was over. We left the team processing the results and will be interested in finding out our own scores. We think they guessed we were human from our answers, but interestingly we thought the judges were robots! This is not the aim of the Turing Test at all and perhaps Alan already knew the answer to the reverse of his question, i.e. if a human is mistaken for a computer more than 30% of the time during a series of five minute keyboard conversations, does it pass the test? Perhaps this happens more frequently than we know in our ever expanding world of IT? We concluded that telephone calls and face to face communication are always best and hiding behind keyboards can have its own drawbacks.

---
Update 9 October 2014:

For another view from 'the hidden human' see Steve Battle's blog post 'I was a hidden human, Or: How Eugene goosed man' here: http://battle-bot.blogspot.co.uk/2014/08/i-was-hidden-human.html

What needs to be pointed out is that each hidden human chatted to one human judge only in 5 tests. The hidden humans never chatted to Eugene or the other four machines in any of the five tests each participated in as a 'foil for the machines'. Each judge simultaneously chatted to five pairs of hidden entities, one human and one machine. The judges' task was to unmask the machines and recognise humans. The hidden humans' task was to be themselves and not give up their identity. The machines' purpose was to give satisfactory responses to any and all the questions/statements the judges put to them in five minutes.

Huma Shah

Tuesday 10 June 2014

Eugene Goostman machine convinced 33.33% of a new set of Judges at The Royal Society 6-7 June 2014, following its 29.17% performance at Bletchley Park in 2012

Eugene Goostman has done it again! *No, it has bettered what it did previously: the machine entry convinced one third of 30 Judges that it was human in Turing tests carried out at The Royal Society London, 6-7 June 2014.
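
As a rough check of that arithmetic, here is a minimal sketch in Python. The count of 10 convinced Judges is simply one third of the 30 Judges stated above, the 30% threshold is the figure quoted in the hidden humans' guest post, and the function and variable names are illustrative assumptions, not part of the experiment's scoring software.

```python
from fractions import Fraction

def deception_rate(judges_convinced: int, judges_total: int) -> Fraction:
    """Share of Judges who scored the machine as human."""
    return Fraction(judges_convinced, judges_total)

# Threshold as quoted in the guest post (30% of Judges).
THRESHOLD = Fraction(30, 100)

# One third of 30 Judges, as reported for The Royal Society tests in 2014.
rate_2014 = deception_rate(10, 30)

print(f"{float(rate_2014):.2%}")   # 33.33%
print(rate_2014 > THRESHOLD)       # True
```

Using exact fractions avoids any rounding question about whether one third is above the threshold.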

The Turing test event was independently adjudicated by Professor John Barnden of Birmingham University, supported by RoboLaw project's Dr. Fiorella Battaglia and Dr. Federica Lucivero.

For the record, at The Royal Society London Vladimir Veselov, Eugene Demchenko and Sergey Ulasen, the team behind machine entry Eugene Goostman, surpassed their previous records of convincing 25% of the Judges at Reading University in 2008 and 29.17% of the Judges at Bletchley Park in 2012 that Eugene was a human - see the New Scientist report by Celeste Biever, one of the Judges in 2012, here:

"Bot with boyish personality wins biggest Turing test": http://www.newscientist.com/blogs/onepercent/2012/06/bot-with-boyish-personality-wi.html

[NB: IBM's Watson team headed by David Ferrucci declined to take part in Turing100 at Bletchley Park in 2012. Watson is a reverse question-answer deep-search system and not built for making human conversation.]

For clarification, in 2012 two versions of Turing's Imitation Game were implemented:

1) The simultaneous comparison Turing test in which a human Judge interrogated two hidden entities at the same time

2) The viva voce Turing test in which the human Judge interrogated one hidden entity at a time

The purpose of the 2012 Turing test experiment was to find which of those two versions of the Imitation Game was the tougher for the machines; it turned out to be 1) above. The Turing100 at Bletchley Park team reported this at a conference [ICAART 2014 in Angers]; you can view that presentation, 'Fundamental Artificial Intelligence and Machine Performance in Practical Turing tests', here.

Other publications from the 2012 Turing tests include:

Assumptions of Knowledge and the Chinese Room in Turing test Interrogation

Good Machine Performance in Turing's Imitation Game

Human Misidentification in Turing tests


For 2014 the Turing test team from Reading University's School of Systems Engineering and Professor Kevin Warwick (Deputy Vice Chancellor-Research, Coventry University & Visiting Professor Reading University) and Dr. Huma Shah (Research Fellow, previously on RoboLaw and now at the Future Institute, Coventry University) implemented the simultaneous comparison Turing test at The Royal Society, 6-7 June 2014.


A simultaneous-comparison Turing test in which a Judge interrogates a human and a machine in parallel
- image created by Reading University's Chris Chapman


30 Judges, 30 Hidden humans and 5 machines took part across 150 simultaneous comparison Turing tests.
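
Those totals can be cross-checked with a short Python sketch. The figures come from this post (30 Judges, 30 hidden humans, 5 machines, five tests per Judge and per hidden human); the variable names are illustrative, and the even spread of tests across the machines is an assumption, though it is consistent with each machine being scored by 30 Judges.

```python
# Figures taken from the post; names are illustrative only.
JUDGES = 30
HIDDEN_HUMANS = 30
MACHINES = 5
TESTS_PER_JUDGE = 5   # each Judge ran five simultaneous-comparison tests
TESTS_PER_HUMAN = 5   # each hidden human took part in five tests

# Every test pairs one Judge with one hidden human and one machine.
total_tests = JUDGES * TESTS_PER_JUDGE
human_appearances = HIDDEN_HUMANS * TESTS_PER_HUMAN

# Assumption: the tests were spread evenly across the five machines.
tests_per_machine = total_tests // MACHINES

print(total_tests, human_appearances, tests_per_machine)  # 150 150 30
```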

The human participants (judges and hidden humans) were drawn from members of the public and came from all walks of life as well as academia, and included some who flew in specially, self-funded, from across Europe, the US and Russia:

- males and females
- adults and teenagers
- native and non-native English speakers
- a celebrity
- a member of the House of Lords
- experts (computer scientists, mathematicians, human language experts), and non-experts (a vet, students, journalists)

The 5 machines involved in 2014 (and also in the 2012 Turing tests at Bletchley Park) were:

Cleverbot - created by Rollo Carpenter

Elbot - created by Fred Roberts

Eugene Goostman - created by Vladimir Veselov and Sergey Ulasen

JFRED - created by Robby Garner

Ultra Hal - created by Robert Medeksza



YouTube of Session 5 Judges in City of London Room 1, and Visitors watching the conversations live on TV screens in the Wellcome Trust Hall in The Royal Society London, 7 June 2014, here:
https://www.youtube.com/watch?v=zgqetyY-_5U&feature=youtu.be


YouTube of Session 6 Judges in City of London Room 1 in The Royal Society can be seen here:
https://www.youtube.com/watch?v=5PBcTK8NnMs



Pictures from The Colonnade Hotel Maida Vale (was Paddington Lodge when Alan Turing was born there in June 1912):

Oh dear my feet got in the way!



Hotel foyer




Turing exhibits in the hotel


Turing plaque at the Colonnade Hotel

Views from my room, above & below in The Colonnade Hotel





One of the best Turing2014 Team memories is staying at Turing's birthplace for one night, before the main/public event, and strolling around Little Venice after a great dinner and discussion with Turing2014 team members.



Pictures from the Royal Society and the Turing tests:


Alan Turing FRS




Reading University, School of Systems Engineering RoboLaw Turing test event in the majestic Royal Society London


Judges, hidden-humans and machine developers after Session 1

Robotics Professor, Martin Smith, with RoboLaw's Dr. Fiorella Battaglia





Turing test conversation: Session 1


Turing2014 extended team: Lunch on main event day: Saturday 7 June 2014

Hidden humans in Session 5_June 7 2014
Hidden human in Session 5_June 7, 2014
Reading SSE's Nellie Round, and Mr. Geoff Round
Turing2014's Chris Chapman watches over as Judges interrogate


Hidden human in Session 5


Visitors reading the simultaneous Judge+2 hidden entity conversations in the Wellcome Hall


Visitors included Robert Medeksza  (in light blue shirt) of Ultra Hal machine competing in the Turing tests


Visitors crowd around 2 big TV screens displaying simultaneous conversations
Research scientists, Chris Knight and Charlie Moorey at The Royal Society



Judges in City of London Room 1

Alan Turing Year's Professor S.Barry Cooper
Simultaneous comparison Test: Judge's screen


Turing2014 team's Sam Denning and Professor Barry Cooper

City of London Room 1: Judge area


Judge reading information before scoring hidden conversationalists


Dr Federica Lucivero and Professor John Barnden (middle right in hidden-human area)

Dr Fiorella Battaglia next to RoboLaw project documents display


Lunch for the Turing2014 team:

Lunch in City of London Room 2: Saturday June 7


Lunchtime in the Control Room: City of London Room 3: Professor Warwick (left), Mrs Irena Warwick, and Lead Independent Adjudicator, Professor John Barnden of Birmingham University


Professor and Mrs Warwick, Professor John Barnden in experiment control room area: lunchtime

Visitors read the conversations as they happen on two big TV screens
© Albert Efimov


Watching the simultaneous conversations in the Wellcome Hall

Human or machine? Can the Visitors decide!

Reading University School of Systems Engineering student Judging in Session 5

Extreme right: Professor Warwick talking to Fred Roberts of Elbot machine competing in the Turing tests



Professor Aaron Sloman, Turing test Judge at The Royal Society London, 7 June 2014

Journalist Judging in Session 5
From left: Mark Allen (MATT technical support), Professor and Mrs Warwick, + 2 Hidden humans

Hidden human area, Professor Warwick explaining purpose

Visitors and participants relaxing in the Wellcome Hall
Session 6 Judges

Professor Warwick (& Dr. Michael Barclay-visual Turing test, seated)

Score announcement audience in the Wellcome Hall

From left: students from School of Systems Engineering; Eleanor and Emily from Reading University's Events team


2014 Turing test event part of EU RoboLaw 'Emerging Technologies' project


RoboLaw: literature on EU project on emerging technologies behind the event


Turing2014 mug for participants and visitors



RoboLaw- other side of Turing2014 mug

The commemorative bags: forgot to save one for myself!



After the last, 6th Session of the Turing test event, on Saturday 7 June 2014 all the Judge scores were independently checked and verified by Professor John Barnden of Birmingham University, and Dr. Fiorella Battaglia and Dr. Federica Lucivero from the EU RoboLaw project.


Each of the Invited Machine Developers was awarded a one-off RoboLaw/Virtual Robots trophy:
Team behind Eugene Goostman machine: from left:
Igor Bykovskih, Sergey Ulasen, Vladimir Veselov, Andrey Adashchik




Robby Garner_JFRED with his trophy


Fred Roberts_ELBOT with his trophy


Robert Medeksza_Ultra Hal with his trophy



One third of the 30 Judges were convinced by Eugene, scoring it as a human. This does not make the human Judges dumb; it is evidence of the Role of Error-making in Intelligent Thought: that smart people can be deceived quite easily. Professor Warwick commented:

 "Having a computer that can trick a human into thinking that someone, or even something, is a person we trust is a wake-up call to cybercrime. The Turing Test is a vital tool for combatting that threat. It is important to understand more fully how online, real-time communication of this type can influence an individual human in such a way that they are fooled into believing something is true...when in fact it is not." 

From here.

Finally, comments from some of the people who were there at the historic event:

Via email:

"Thanks once again for a wonderfully organized event. They get better and better! :)
... it was great to be included, especially in such an historic result. I think it will certainly shake up the Turing test debates for quite a while."


 " I would just like to personally thank you for all the effort you clearly put into the event on the 7th and to thank you for allowing me to be involved. It was a fantastic day with a great outcome and I enjoyed every minute. Artificial intelligence is a  fascinating field and a potential world changing technology. I would be happy to be involved or help out in any way possible for future events projects or anything else related to AI."

"Thank you for letting me take part last Saturday. It was great and I really enjoyed it."

"I am so pleased that the event went well."

"Thank you very much for the excellent event and letting me be a part of it. It was thoroughly enjoyable. I’ve been showing off with the items in the goody bag as colleagues had heard about the event on Friday through media. "

"to participate in last weekend's Turing Tests; it was a real privilege to be involved in some way - they've certainly caused a stir."

"take part in the Turing Test. It was very interesting and my grandson ... realy loved it. 
I will be interested to eventually find out the results as he was pretty sure he knew at least one of them was a robot."

Via text message:

"That was amazing"

Twitter comments:


Debunking Eugene: Montreal cognitive scientist Stevan doubts UK university's test claim


RT: The Test Is Not A Gallup Poll (And It Was Not Passed)

Congratulations on great test experiment 6-7 June 2014 stimulated great discussion!
it was a real pleasure. I was honoured to be involved at all even in such a small way.
Blurring the lines - Eugene is the first program to pass the Turing Test
Celebrity judges and a "child" computer? Is this some kind of joke? Disgraceful.
is this an Ubuntu screen?
What an amazing day and great to meet up with people that I normally only speak to online. Well done to all.
when are you going to release some transcripts? Thanks
Got a link to the transcripts? Or sample chat from the Turing Test winner?



Newspaper and online magazine reports of the Turing test event include:

Slate article by David Auerbach pointing out what the Turing test actually isn't:

"Hunch CEO Chris Dixon tweeted, “The point of the Turing Test is that you pass it when you've built machines that can fully simulate human thinking.” No, that is precisely not how you pass the Turing test. You pass the Turing test by convincing judges that a computer program is human."  

Professor Kevin Warwick on 'How the Turing test was won and why it matters' in The Independent Voice :
http://www.independent.co.uk/voices/comment/how-the-turing-test-was-passedand-why-it-matters-9528861.html


Robert Llewellyn, one of the Turing Test Judges on Day 1: 6 June 2014, in the Guardian on his experience at The Royal Society:  http://www.theguardian.com/science/2014/jun/09/turing-test-eugene-goostman

BBC Radio 4 Today: John Humphrys interviews Eugene Goostman 11 June 2014:
http://www.bbc.co.uk/programmes/p020rrgx  and  http://www.bbc.co.uk/programmes/b006qj9z

BBC News: http://www.bbc.co.uk/news/technology-27762088

Gizmodo: http://gizmodo.com/this-is-the-first-computer-in-history-to-have-passed-th-1587780232

NBC News: http://www.nbcnews.com/tech/tech-news/turing-test-computer-program-convinces-judges-its-human-n125786

The Guardian:
http://www.theguardian.com/technology/2014/jun/09/scientists-disagree-over-whether-turing-test-has-been-passed

and

http://www.theguardian.com/technology/shortcuts/2014/jun/09/eugene-goostman-turing-test-computer-program

The Daily Telegraph:
http://www.telegraph.co.uk/technology/news/10884839/Computer-passes-Turing-Test-for-the-first-time-after-convincing-users-it-is-human.html

The Independent:
http://www.independent.co.uk/life-style/gadgets-and-tech/computer-becomes-first-to-pass-turing-test-in-artificial-intelligence-milestone-but-academics-warn-of-dangerous-future-9508370.html

The Verge:
http://mobile.theverge.com/2014/6/8/5790936/computer-passes-turing-test-for-first-time-by-convincing-judges-it-is

Wired:
http://www.wired.co.uk/news/archive/2014-06/09/turing-test-eugene

iFree press release: http://www.i-free.com/en/press/news/6052


BBC London report from Friday 6 June

http://www.bbc.co.uk/news/uk-england-london-27742565?post_id=129832

Report & pictures from Friday 6 June here

http://turingtestsin2014.blogspot.co.uk/2014/06/bbc-report-of-first-day-of-turing-tests.html


Animation for the event: https://www.youtube.com/watch?v=0Hgw9RVwbaw&feature=youtu.be



*For people wondering about methodology of the Royal Society Turing tests, see Warwick & Shah paper in IEEE Transactions on Computational Intelligence and AI in Games: 'Good Machine Performance in Turing's Imitation Game'  and also 'Human Misidentification in Turing tests' available to download for free (as at 15 June 2014) from Journal of Experimental and Theoretical AI here:

http://www.tandfonline.com/action/.U53HkmfjhWA#.U53UI9SwXMU


For people seeking transcripts of the conversations from the Royal Society tests, please note that, along with the Judges' scores, these will be submitted to peer-reviewed scientific journals and conferences. Please be patient and note Professor Warwick's comment:


"As you might imagine we are yet to unravel the transcripts but when we do these will become available via the normal academic route through academic papers, with our commentary as support. When the papers appear so others will be able to examine the transcripts and see why 33.3% of the interrogators were convinced. We will most likely present each of the transcripts alongside their corresponding hidden human transcript as this is an important part of the tests."


We welcome researchers with different views on what the Turing test is to stage their own public Turing test event. Our interpretation comes from Huma Shah's PhD: Deception-detection and machine intelligence in Practical Turing tests, see also blog post What is Turing test success?

© Dr. Huma Shah 10 June 2014   - content (pictures/text) may be copied with acknowledgement.
[Revised 11 June 2014]
[Updated 14 June 2014, with Slate article link & more photos]
[2nd Update 15 June 2014: more photos and link to 'Human Misidentification in Turing tests' paper in Journal of Experimental and Theoretical AI]
[3rd Update: 27 June 2014: added more photos including of Robert Medeksza of Ultra Hal from Zabaware ]
[4th Update: 29 June 2014: adding Stevan Harnad's Tweets; Like everyone else, he hasn't been involved in a simultaneous comparison Turing test with a human and Eugene Goostman]
[5th Update 6 October 2014: added photograph of team behind development of Eugene Goostman machine]