25 Sep 2007

Extract Data From WordNet

Get a word's definitions, synonyms, and hyponyms (more specific related words) using Perl and WordNet. The word must be given as a command-line argument, but the script could easily be modified to read an HTML form parameter instead.
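As a rough sketch of the HTML-parameter variant mentioned above: assuming the script runs as a CGI program and the word arrives in the query string as ?word=..., it could be read as shown here. The parameter name "word" and the helper word_from_query are hypothetical, and the decoding is deliberately minimal; the CGI module's param() is the more usual choice where that module is installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pull the "word" parameter out of a CGI query string.
# Returns undef (in scalar context) when the parameter is missing.
sub word_from_query {
    my ($query) = @_;
    return unless defined $query;
    for my $pair (split /[&;]/, $query) {
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $key && $key eq 'word' && defined $value;
        $value =~ tr/+/ /;                              # "+" encodes a space
        $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # decode %XX escapes
        return $value;
    }
    return;
}

# Prefer the CGI query string, fall back to the command line.
my $word = word_from_query($ENV{QUERY_STRING});
$word = $ARGV[0] unless defined $word;

print "Content-type: text/plain\n\n" if defined $ENV{QUERY_STRING};
print defined $word ? "Looking up: $word\n" : "No word given\n";
```

The value of $word can then be passed to the WordNet lookup exactly as the command-line argument is.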

What you need:

  1. Download and install WordNet

  2. Download and install the WordNet::QueryData Perl interface by Jason Rennie

  3. Save the Perl script below as wordnet.pl

  4. Usage: perl wordnet.pl word

    #!/usr/bin/perl -w
    use strict;
    use warnings;
    use WordNet::QueryData;

    # The word to look up is the first command-line argument.
    my $word = $ARGV[0] or exit(usage());
    my $wn = WordNet::QueryData->new;

    # List every noun sense of the word, e.g. "dog#n#1", "dog#n#2", ...
    my @senses = $wn->querySense("$word#n");
    my $i = 1;

    foreach my $sense (@senses) {
        my $temp_sense    = join(",", $wn->querySense($sense, "glos"));
        my $temp_synset   = join(",", $wn->querySense($sense, "syns"));
        my $temp_hyponyms = join(",", $wn->querySense($sense, "hypo"));

        # Strip the "#pos#sense" suffix (e.g. "#n#1") from each entry.
        $temp_synset   =~ s/#\w#\d+//g;
        $temp_hyponyms =~ s/#\w#\d+//g;

        # Multiword entries use underscores; show them as spaces.
        $temp_synset   =~ s/_/ /g;
        $temp_hyponyms =~ s/_/ /g;

        print $i . ". " . $temp_sense . "\n";
        print "Synonyms: " . $temp_synset . "\n";
        print "Related: " . $temp_hyponyms . "\n";
        $i++;
    }

    sub usage {
        print "\n\n\nUsage: " . $0 . " word\n\n\n\n";
    }
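The cleanup step inside the loop can be tried on its own, without WordNet installed. This is a minimal sketch, assuming sense strings of the form word#pos#sense (the form the script's substitutions target) and assuming multiword entries are joined with underscores. The helper clean_labels and the sample strings are hypothetical and not part of WordNet::QueryData.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn sense strings such as "domestic_dog#n#1"
# into readable labels such as "domestic dog".
sub clean_labels {
    my @labels = @_;        # work on a copy, leave the caller's data alone
    for (@labels) {
        s/#\w#\d+$//;       # drop the trailing "#pos#sense" suffix
        tr/_/ /;            # show underscores as spaces
    }
    return @labels;
}

# Sample input written by hand, not fetched from WordNet.
print join(", ", clean_labels("dog#n#1", "domestic_dog#n#1", "Canis_familiaris#n#1")), "\n";
# prints: dog, domestic dog, Canis familiaris
```

Cleaning each entry separately, before joining, keeps the substitutions from ever touching the comma separators.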