NARG: Difference between revisions

From HacDC Wiki

Revision as of 02:25, 16 April 2010

Welcome to the HacDC Natural Language Processing and Artificial Intelligence Group (NARG)

Overview

The mission of NARG is to bring together HacDC community members who are interested in NLP and AI for research, projects, and knowledge sharing. Supporting members in getting projects done is the primary goal. Contact User:Obscurite for more info.

Reference and Resources

Add links to AI/NLP reference material, courseware, etc.

Members

Some profiles of our members and what they're into:

  • User:Obscurite (Daniel Packer) - Interested in emotional interfaces, responsive human interfaces, brain and bio signals, intelligent metadata, and cyborg tech.
  • Philip Stewart - Primarily interested in figurative language comprehension, semantics, and digital poetics. Secondarily, event-related potential (ERP) studies, consciousness, and applying scientific findings to philosophical "problematics" in novel ways. Coursework in psycholinguistics, physiological psychology, pharmacology, and functional neuroanatomy.
  • Darius Roberts - Interested in health, but if there was a way to make a white-label vark.com that would be my first choice of projects.
  • Todd Fine - Interested in analyzing the stream of meaning from humans on the internet -- twitter is especially curious. I am a bit obsessed with text-to-speech integrated into ambient soundscapes. Have flirted with various machine learning and ai algorithms, but always need to refresh. I am also interested in simple game AI and strategy. Also like computer word games and computer-generated theater/poetry. Have used nltk and would like to learn more.
  • Al Haraka
  • Phil Kimmey: Interest in AI, with a primary interest in learning more about non-deterministic approaches and applications, which hopefully will lead to an interest in NLP as well.
  • Mike Daren (User:Mdaren) - Most experience in discrete event-based simulations.
  • Michael

Meetings

NARG meets on Thursdays at HacDC from 7-9pm. The focus alternates between AI and NLP each week, which gives folks two weeks to digest the previous meeting's content/projects on a given topic.

Other events and cancellations will be announced via the mailing list. Check it out on the NARG mailman page.


Meeting minutes for Apr 15 2010

In attendance were Daniel, Mike, and Todd F.

We discussed a web spidering project and started looking into a Python project using mechanize, Beautiful Soup, and NLTK. The code downloads the Wikipedia entries for presidents, pickles and saves them, then cleans them and saves the cleaned versions. The next step is to tokenize and process the text in NLTK. The code will be put into a git repo (when Todd gets time). The first time you run it, it will download and serialize the data from Wikipedia. (Please check Wikipedia's terms and conditions, license, EULA, etc. before running.) The code is attached as File:narg_pypres.tgz.
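The download, pickle, clean, tokenize flow described above could look roughly like the sketch below. It is not the code from the meeting: it swaps in the Python standard library (html.parser, re) for Beautiful Soup and NLTK so it runs anywhere, and the function names, the cache layout, and the `fetch` callback are all assumptions made for illustration. Passing the downloader in as `fetch` means the sketch never touches Wikipedia on its own.

```python
import pickle
import re
import tempfile
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def clean_html(html):
    """Strip tags and collapse runs of whitespace (stand-in for Beautiful Soup)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.chunks)).strip()


def load_or_fetch(name, fetch, cache_dir):
    """Return raw HTML for `name`; call fetch(name) only if no pickle exists yet."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{name}.pickle"
    if path.exists():
        with path.open("rb") as fh:
            return pickle.load(fh)
    html = fetch(name)
    with path.open("wb") as fh:
        pickle.dump(html, fh)
    return html


def tokenize(text):
    """Crude lowercase word tokenizer (stand-in for NLTK's tokenizers)."""
    return re.findall(r"[A-Za-z0-9']+", text.lower())


if __name__ == "__main__":
    # Hypothetical fetcher returning canned HTML instead of hitting the network.
    def fake_fetch(name):
        return "<h1>George Washington</h1><p>First president.</p>"

    with tempfile.TemporaryDirectory() as cache:
        raw = load_or_fetch("George_Washington", fake_fetch, cache)
        print(tokenize(clean_html(raw)))
```

The pickle cache is what makes the first run slow and later runs cheap: only the first call for a given name invokes the downloader.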

Meeting minutes for Apr 1 2010

In attendance were Daniel, Darius, Todd, and Brad.

We attempted to get Darius's Rovio robot going, but had networking issues. Todd gave an overview of the k-means clustering algorithm, and of clustering in general, using the Collective Intelligence book (listed in resources) as a reference. Brad gave some insights into generalizing the Euclidean distance calculation from a math perspective: there are different distance equations for clustering, and he mentioned that at NASA Manhattan distance was very useful for artificial vision. We brainstormed ways to use clustering for social networks and other web databases. We also discussed potential Hadoop/MapReduce projects using pycloud or other cloud processing services. The meeting closed with burritos, fried tacos, and a bit of late night hacking.
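For anyone who missed the overview, here is a minimal sketch of both ideas. One common generalization of Euclidean distance is the Minkowski distance of order p, where p=1 gives Manhattan distance and p=2 gives Euclidean; whether this is the generalization Brad presented is an assumption. The k-means below is a plain textbook version written for these notes, not code from the meeting or from the book, and all names are illustrative.

```python
import random


def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length points."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def manhattan(a, b):
    """Order 1: sum of absolute coordinate differences."""
    return minkowski(a, b, 1)


def euclidean(a, b):
    """Order 2: straight-line distance."""
    return minkowski(a, b, 2)


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, iterations=100, seed=0):
    """Cluster `points` (a list of tuples) into k groups.

    Uses Euclidean distance for assignment, since the mean update below
    is the matching centroid rule for that metric.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for pt in points:
            nearest = min(range(k), key=lambda i: euclidean(pt, centroids[i]))
            clusters[nearest].append(pt)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            mean_point(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable, so we have converged
        centroids = new_centroids
    return centroids, clusters
```

Calling `kmeans(points, 2)` on a list of (x, y) tuples returns the final centroids and the points grouped by cluster. Swapping the metric used for assignment without also changing the update rule (for example, medians for Manhattan distance) gives a different algorithm, so the sketch keeps the two distance helpers separate from the clustering loop.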

Meeting minutes for Mar 18 2010

In attendance were Daniel, Brad, Mike, and New-Mike.

We had a general discussion about many things.

Meeting minutes for Mar 11 2010

In attendance were Darius, Daniel, Brad, Phil Stewart, Mike, and A.J.

Brad presented on Subsumption architectures. He will attach slides for this and the previous presentation. We watched a Breve demo of Brad's subsumption implementation (a very abstracted version equivalent to nested ifs), and he did some live coding which was fun.
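To make the "nested ifs" abstraction concrete, here is a rough sketch of priority arbitration between behavior layers. It is not Brad's Breve implementation: the behaviors, sensor keys, thresholds, and the choice of which layer wins are all assumptions made for illustration, and it leaves out the suppression and inhibition wiring of a full subsumption architecture.

```python
def avoid_obstacle(sensors):
    """Fires only when something is closer than 1.0 (arbitrary units)."""
    if sensors.get("obstacle_distance", float("inf")) < 1.0:
        return "turn_away"
    return None


def seek_flag(sensors):
    """Fires only when the flag is in view."""
    if sensors.get("flag_visible"):
        return "move_to_flag"
    return None


def wander(sensors):
    """Default behavior; always has an opinion."""
    return "wander"


# Highest priority first. The first layer that returns an action
# overrides every layer after it.
LAYERS = [avoid_obstacle, seek_flag, wander]


def arbitrate(sensors, layers=LAYERS):
    """Return the action of the highest-priority layer that fires."""
    for layer in layers:
        action = layer(sensors)
        if action is not None:
            return action
    return "idle"


if __name__ == "__main__":
    print(arbitrate({"obstacle_distance": 0.4, "flag_visible": True}))
    print(arbitrate({"obstacle_distance": 5.0, "flag_visible": True}))
    print(arbitrate({}))
```

Unrolling the loop in `arbitrate` gives exactly an if / elif / else chain, which is why the abstracted version can be described as nested ifs.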

Brad suggested a long term contest idea analogous to Hackerspaces in Space, maybe using pygame. We discussed various ideas that would make fun competitions.

Meeting minutes for Mar 4 2010

Secondhand minutes about the meeting from Daniel (who did not attend due to a scheduling conflict):

  • NLTK intro from Todd Fine (first few chapters of NLTK book - see resources section for link)
  • Discussion of approaches to AI vs. NLP in the group (AI is more game/sim oriented; NLP is more machine learning oriented, i.e. Bayesian)

Meeting minutes for Feb 25 2010

In attendance were Nikolas, Todd, Phil, Michael, Daniel, Brad, and Darius.

  • We agreed to alternate AI and NLP topics every other week to give people more time to digest material and lighten the burden on presenters/teachers
  • Daniel will present on NLP/NLTK next meeting

Brad did a great demo of several Breve simulations, including the capture the flag (CTF) simulation he ported to Python from a class he'd taken. We looked at simulations of Braitenberg machines moving towards or away from stimulation sources. We analyzed the two existing CTF bots, looked at the code that defines them, and asked Brad a lot of questions about what the bots could do in code (there are a lot of specifics!). We're supposed to install Breve for the next AI focus meeting and start poking at the code.
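The towards-or-away behavior of a Braitenberg machine comes down to how two sensors are wired to two motors. The sketch below is a hedged illustration in plain Python, not the Breve code from the demo; it assumes a two-wheeled vehicle where each motor's speed simply equals the reading of the sensor it is wired to, and the function names are made up for these notes.

```python
def wheel_speeds(left_sensor, right_sensor, crossed):
    """Map two sensor readings to (left_motor, right_motor) speeds.

    Uncrossed wiring connects each sensor to the motor on its own side;
    crossed wiring connects each sensor to the motor on the opposite side.
    """
    if crossed:
        return right_sensor, left_sensor
    return left_sensor, right_sensor


def turn_direction(left_motor, right_motor):
    """A differential-drive vehicle turns away from its faster wheel."""
    if left_motor > right_motor:
        return "right"
    if right_motor > left_motor:
        return "left"
    return "straight"


if __name__ == "__main__":
    # Stimulus on the vehicle's left, so the left sensor reads higher.
    left_reading, right_reading = 0.9, 0.2
    for crossed in (False, True):
        motors = wheel_speeds(left_reading, right_reading, crossed)
        print("crossed" if crossed else "uncrossed", turn_direction(*motors))
```

With the stimulus on the left, uncrossed wiring speeds up the left wheel and the vehicle veers right, away from the source; crossed wiring speeds up the right wheel and the vehicle veers left, towards it.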

During Brad's presentation, at the point where he briefly covered AI history, there was a fascinating conversation between Brad, Nikolas, and Todd about ways to define and contrast machine learning and AI. In the end the consensus seemed to be that machine learning is a rigorous academic field with a focus on mathematics and numerical analysis, whereas AI is more general and has a more philosophical bent. Brad said that in his school days, the machine learning profs would make a point of saying they weren't in "AI". Nikolas posited that this might be due to the stigma AI received from its failure to achieve the rapid results it promised early on, and that seemed logical.

The code for CTF has been put up on GitHub.

Meeting minutes for Feb 20 2010

The first NARG meeting was held on Feb 20, 2010 at Sticky Fingers Bakery. In attendance were Brad, Darius, Phil (not Stewart - a HacDC newcomer), and Daniel. The conversation was relatively free form but a few suggestions were favored:

  • Meetings will be ongoing at HacDC on Thursday evenings at 7pm, realizing that due to the high frequency of meetings, some folks will miss some meetings.
  • Brad will put together a demo/tutorial using the spiderland.org breve environment on Braitenberg vehicles as an entry point into AI learning. We will collectively try to use this environment (a virtual 3D world with actuators and sensors for 3D movement and input) and graduate to subsumption architectures and neural nets. We'll use Python since most people are willing to use it and have at least played with it, though Brad personally prefers Steve, breve's own scripting language (some unholy combo of Smalltalk and JavaScript?).
  • Daniel will put together a demo/tutorial based on NLTK and the book "Natural Language Processing with Python", which he has a copy of for reference.
  • We will eventually choose a robotics platform for physical AI, either a repurposed Roomba-type solution (favored by Phil) or an open AVR/Arduino/microcontroller-based bot like: http://www.adafruit.com/blog/2009/04/20/arduino-powered-braitenberg-vehicle-light-seeking-robot/

Other topics:

  • Brad, Todd, Darius, and Daniel have downloaded the Google AI Tron code. Brad and Todd have working custom code, and we will keep an eye out for good show-and-tell opportunities. Brad's solution is neural net based.
  • Daniel brought up the idea of machine-readable codification of human ideas/statements, and its political ramifications, after Phil mentioned .gov open data and how it's not well formatted for real-time use. Brad mentioned Lojban (http://www.lojban.org/tiki/Lojban), a language that attempts to remove ambiguity.
  • Daniel is interested in using AI for bio signal interpretation and NLP for emotionally contextual interfaces/digital ghosts. Darius is interested in using NLP for matching content with expertise, like http://vark.com, which was acquired by Google a week or so ago. Brad is interested in AI as a practitioner (it's his job) and wants to do some virtual 3D simulations. Phil is open to pretty much anything (he's too young to know better).
  • Brad suggested there were ways to bridge AI and NLP. Daniel brought up one such bridge: agent-based AI that uses NLP-based communication models in evolutionary scenarios. This generally convinced everyone that there are some exciting potential bridges between the two disciplines.