Designing an expert system

Hi all,

I'm building a VB6 application with ADO 2.81 and Access 2003 as a backend. The app parses recipe files and imports ingredients into a table.

Now, let's take as an example 8 different recipes that have the same ingredient (e.g. cilantro). When I parse these 8 recipes I will check whether the ingredient is already available in the table. If it is, I'll use it in the recipe. If not, I have to create a new one. So let's take these 8 recipes, where they have the following ingredient lines for cilantro:

Snipped fresh cilantro
Cilantro -- finely chopped
Leaves of Cilantro -- optional
Chopped cilantro
Cilantro, fresh, chopped
Cilantro, chopped
fresh cilantro leaves -- coarsely chopped
Cilantro, minced

The whole idea of the app is to have only one ingredient called cilantro and not 8. How would one go about designing an intelligent function that recognizes that cilantro is the ingredient and will use that instead of creating 8 different records in the table?

Thanks,
Ivan

Ivan Debono wrote:
> The whole idea of the app is to have only one ingredient called cilantro and
> not 8. How would one go about designing an intelligent function that
> recognizes that cilantro is the ingredient and will use that instead of
> creating 8 different records in the table?

In general, a difficult problem. If you do end up creating something that can parse descriptions like this as well as a person, be sure to tell us, as you would probably be eligible for an AI prize or two :)

Depending on how many recipes you have to process, you should first consider the time it will take to develop an automatic procedure versus the time it would take to do the data entry manually. The latter would be long and tedious but easy. The former might end up getting you nowhere.

Here are some thoughts on how to approach this:

Make various lists that contain words that might be in the recipe, but are not ingredients themselves, e.g.:

- a list of verbs (and their various inflections) that describe ways in which ingredients might be prepared. Thus: chop(ped), snip(ped), mince(d) etc.

- a list of adjectives that describe ingredients. Thus: fresh, boiled, raw, premium etc.

When presented with an input string, remove any words that match entries on these lists. What you end up with might be ingredients. Look at the outputs you get and iteratively add things to your lists. For example, your sample inputs end up (after also removing punctuation) as:

cilantro
Cilantro finely
Leaves of Cilantro optional
cilantro
Cilantro
Cilantro
cilantro leaves coarsely
Cilantro

So 'optional' becomes another list candidate, as do various adverbs that might modify the verbs you already have. Have a list of 'small words', e.g. a, an, the, of, in etc., and your list becomes (after forcing to lower case):

cilantro
cilantro
leaves cilantro
cilantro
cilantro
cilantro
cilantro leaves
cilantro

We are almost there. 'Leaves' is tricky because it might actually be part of the ingredient (e.g. bay leaves). The final step would then be to have a mapping table that maps both 'cilantro leaves' and 'cilantro' to your actual ingredient record.

You will find that this whole process will be a highly *iterative* one - set up your filters, run through the data, see what you get, refine your filters, run again, refine, run, ... until you end up with data that you can just put into a mapping table as described above, and then you are done.

Until of course your next batch of input contains a word you've never encountered before...

--
Larry Lard
Replies to group please
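The filtering steps Larry describes can be sketched in a few lines. This is only an illustration in Python (the app in the thread is VB6), and the word lists, the INGREDIENT_MAP table and the function names are assumptions made up for the example. The extra 'leaves cilantro' entry is also an assumption, added because that phrase shows up in the filtered output above.

```python
import re

# Hypothetical word lists; in practice they grow as each run is reviewed.
PREP_VERBS = {"chopped", "snipped", "minced"}
ADJECTIVES = {"fresh", "boiled", "raw", "premium"}
ADVERBS_AND_NOTES = {"finely", "coarsely", "optional"}
SMALL_WORDS = {"a", "an", "the", "of", "in"}
IGNORE = PREP_VERBS | ADJECTIVES | ADVERBS_AND_NOTES | SMALL_WORDS

# Mapping table: filtered phrase -> canonical ingredient (assumed contents).
INGREDIENT_MAP = {
    "cilantro": "cilantro",
    "cilantro leaves": "cilantro",
    "leaves cilantro": "cilantro",
}

def filter_line(line):
    """Lower-case the line, drop punctuation, and remove ignorable words."""
    words = re.findall(r"[a-z]+", line.lower())
    return " ".join(w for w in words if w not in IGNORE)

def lookup(line):
    """Return the canonical ingredient, or None if the phrase is not mapped yet."""
    return INGREDIENT_MAP.get(filter_line(line))

SAMPLES = [
    "Snipped fresh cilantro",
    "Cilantro -- finely chopped",
    "Leaves of Cilantro -- optional",
    "Chopped cilantro",
    "Cilantro, fresh, chopped",
    "Cilantro, chopped",
    "fresh cilantro leaves -- coarsely chopped",
    "Cilantro, minced",
]

for sample in SAMPLES:
    print(f"{sample!r} -> {lookup(sample)}")
```

A line that filters down to an unmapped phrase (for example 'Bay leaves') returns None, which is the cue to review it by hand and extend the lists or the mapping table.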
"Larry Lard" <larryl***@hotmail.com> wrote in message news:1128687301.185126.39100@g43g2000cwa.googlegroups.com...
> Make various lists that contain words that might be in the recipe, but
> are not ingredients themselves.
<snip>

Guess what... I was thinking along the same lines as you!!

Your verbs are actually my "preparation methods", which are stored in a separate table and already in place. So those are out of the way.

Secondly, I have a similar "ignore" table that includes adjectives and prepositions that will be ignored when parsing the ingredient line.

Thirdly, I also have an "aliases" table where the whole original ingredient line is linked with the unique ingredient id and saved. This saves lots of time if the same ingredient line is encountered again: the ingredient id comes directly from this table instead of breaking down the ingredient line on a word-by-word basis.

What I should do is test-parse all recipes, populate all the necessary tables, and then do the real parse. In this way, the algorithm will have lots of data to start with :)

Thanks for the ideas!!

Ivan

On Fri, 7 Oct 2005 12:55:22 +0200, "Ivan Debono" <ivanm***@hotmail.com> wrote:

<snip>
>Snipped fresh cilantro
>Cilantro -- finely chopped
>Leaves of Cilantro -- optional
>Chopped cilantro
>Cilantro, fresh, chopped
>Cilantro, chopped
>fresh cilantro leaves -- coarsely chopped
>Cilantro, minced

>The whole idea of the app is to have only one ingredient called cilantro and
>not 8. How would one go about designing an intelligent function that
>recognizes that cilantro is the ingredient and will use that instead of
>creating 8 different records in the table?

That is very interesting

I recommend that you create a Lexicon
- ie: a sorted list of words

But not of the ingredients, rather of words that are not ingredients
- eg: chopped '--' optional ',' of

You might also make a list of adjectives, eg: Fresh, Dried
- as there is a difference between Fresh Cilantro and Dried Cilantro

Heck - you might as well have a list of ingredients as well

By the time you have removed the 'non ingredients' what will be left will be an ingredient - and optionally an adjective

The reasoning is that there are probably a lot more ingredients than non-ingredients - also the system will 'fail safe': one can tolerate a bit of garbage, but missing a vital ingredient will screw things up big time.

Obviously this will require quite a lot of manual work ticking words off as non-ingredients, but a few simple utilities (written by you) will make that fairly simple.

I can see a few problems with things like Vanilla Essence versus Vanilla Pod, but I think you'll get round that with the optional adjectives; failing that you could make compound words

Incidentally quantities need looking at

I've done similar things before as spelling checkers and search accelerators - they work quite nicely, especially when 'trained'

I would just use text files for this.
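A rough sketch of the lexicon idea follows, in Python rather than VB6. The lexicon contents and the names in_lexicon and analyse are assumptions for illustration; in a real run the two lists would be read from the plain text files suggested above. It keeps the lists sorted and searches them with a binary search, retains adjectives such as fresh/dried next to the ingredient, and 'fails safe' by keeping any word it does not recognise.

```python
from bisect import bisect_left

# Hypothetical lexicon contents, kept sorted so they can be binary-searched.
NON_INGREDIENTS = sorted(["--", ",", "chopped", "finely", "minced",
                          "of", "optional", "snipped"])
ADJECTIVES = sorted(["dried", "fresh"])

def in_lexicon(lexicon, word):
    """Binary search for word in a sorted list."""
    i = bisect_left(lexicon, word)
    return i < len(lexicon) and lexicon[i] == word

def analyse(line):
    """Return (ingredient words, adjectives); non-ingredients are discarded."""
    tokens = line.lower().replace(",", " , ").split()
    ingredient, adjectives = [], []
    for token in tokens:
        if in_lexicon(NON_INGREDIENTS, token):
            continue
        if in_lexicon(ADJECTIVES, token):
            adjectives.append(token)
        else:
            ingredient.append(token)  # fail safe: unknown words are kept
    return " ".join(ingredient), adjectives

print(analyse("Cilantro, fresh, chopped"))
print(analyse("Dried cilantro -- crumbled"))
```

In the second call 'crumbled' is not in any list, so it stays attached to the ingredient. That is the bit of garbage the approach tolerates until someone ticks the word off as a non-ingredient.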
"J French" <erew***@nowhere.uk> wrote in message news:4346645a.2849493@news.btopenworld.com...
> I can see a few problems with things like Vanilla Essence versus
> Vanilla Pod, but I think you'll get round that with the optional
> adjectives, failing that you could make compound words
<snip>
> I would just use text files for this.

Vanilla Essence and Vanilla Pod will be 2 distinct ingredients, so compound words such as these will also be handled.

I have a mixture of tables and text files.

I use tables where I need to link up with other tables and where the app stores the information it learns from. I use a text file so that ignore words can be added by the user.

Ivan

On Fri, 7 Oct 2005 14:58:21 +0200, "Ivan Debono" <ivanm***@hotmail.com> wrote:

<snip>
>Vanilla Essence and Vanilla Pod will be 2 distinct ingredients, so compound
>word such as these will also be handled.

Possibly: Vanilla_Essence and Vanilla_Pod

>I have a mixture of tables and text files.

Sensible - personally I prefer to do my own filing

>I use tables where I need to link up with other tables and where the app
>stores the information it learns from. I use a text file so that ignore
>words can be added by the user.

I was fascinated to see that Larry came up with a similar approach
- also that you were running down the same path

Ralph's point is interesting; if I understood it correctly, he is suggesting a rigorous analysis of each recipe - a very good idea

I envisaged a multiline textbox on the left of the page and a listbox with ingredient analysis on the right hand side, but my post was getting rambling so I kept it short

I think that rigorous analysis is a very good idea; in a peculiar way, it might actually save work.

This may sound odd, but I reckon that whenever one writes a 'compiler' one should also write a 'decompiler'; it is a darn good way of showing up errors (amongst other things)
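One way to read the Vanilla_Essence suggestion is to glue known multi-word ingredients into a single token before any word-by-word filtering runs, with a matching reverse step in the spirit of the 'decompiler' remark. The sketch below is Python, not VB6, and the COMPOUNDS list and both function names are assumptions for illustration only.

```python
import re

# Hypothetical list of multi-word ingredients to treat as one token.
COMPOUNDS = ["vanilla essence", "vanilla pod", "bay leaves"]

def join_compounds(line):
    """Lower-case the line and replace each known compound with an underscore token."""
    text = " ".join(line.lower().split())  # normalise whitespace first
    for phrase in sorted(COMPOUNDS, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, phrase.replace(" ", "_"), text)
    return text

def split_compounds(text):
    """Reverse step: turn underscore tokens back into plain words."""
    return text.replace("_", " ")

print(join_compounds("Vanilla Essence"))
print(join_compounds("1 Vanilla  Pod, split"))
print(split_compounds(join_compounds("Bay leaves")))
```

Because the two vanilla ingredients become different single tokens, a later filter cannot collapse them into one 'vanilla' record. Running the reverse step and comparing with the normalised input is a cheap check that nothing was lost.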
"Ivan Debono" <ivanm***@hotmail.com> wrote in message news:eB9hUzyyFHA.736@tk2msftngp13.phx.gbl...
> The whole idea of the app is to have only one ingredient called cilantro and
> not 8. How would one go about designing an intelligent function that
> recognizes that cilantro is the ingredient and will use that instead of
> creating 8 different records in the table?
>
> Thanks,
> Ivan

Quick First Brush White Board

// table_Recipe DDL
ID: seq_auto_number
Name: Text
' DataView
1 Soup
2 Meatloaf

// table_RecipeIngredientsXREF
RecipeID: FK (number)
IngredID: FK (number)
' DataView
1 2
1 3

// table_Ingredient DDL
ID: seq_auto_number
Item: FK (number)
Source: FK (number)
Preparation: FK (number)
' DataView
1 1 1 1
2 2 2 5
3 3 1 1

// table_Item DDL
ID: seq_auto_number
Name: Text
' DataView
1 Cilantro
2 Onion
3 Celery

// table_Source DDL
ID: seq_auto_number
Source: Text
' DataView
1 Fresh
2 Canned
3 Bagged

// table_Preparation DDL
ID: seq_auto_number
Prepare: Text
' DataView
1 chopped
2 coarsely chopped
3 snipped
4 whole
5 minced

Should give you an idea.
Note: You may not want to use an auto_number
A complete unique readable tag would do.
There are pros and cons.

hth
-ralph
"Ralph" <nt_consultin***@yahoo.com> wrote in message news:orydnShpNqGM8NvenZ2dnUVZ_smdnZ2d@arkansas.net...
> Should give you an idea.
> Note: You may not want to use an auto_number
> A complete unique readable tag would do.
> There are pros and cons.

Forgot to add that the tables would serve as both storage for "lookup" keys and also for the "gathering". A true expert system will also include 'training', e.g. in the example used by Mr. French, if your parser identified "Essence" as an attribute, it merely adds it to the table and goes on.

Obviously during training some silly configurations will appear. Many can be caught by just examining the various tables.

-ralph
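The double role of the whiteboard tables (lookup keys plus 'gathering') could be tried out roughly like this. The sketch uses Python's built-in sqlite3 module instead of the Access/ADO backend from the thread and leaves out the Recipe and cross-reference tables. The helper names get_or_add and add_ingredient are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Item        (ID INTEGER PRIMARY KEY, Name    TEXT UNIQUE);
CREATE TABLE Source      (ID INTEGER PRIMARY KEY, Source  TEXT UNIQUE);
CREATE TABLE Preparation (ID INTEGER PRIMARY KEY, Prepare TEXT UNIQUE);
CREATE TABLE Ingredient (
    ID          INTEGER PRIMARY KEY,
    Item        INTEGER REFERENCES Item(ID),
    Source      INTEGER REFERENCES Source(ID),
    Preparation INTEGER REFERENCES Preparation(ID),
    UNIQUE (Item, Source, Preparation)
);
""")

# Fixed whitelist: table and column names cannot be bound as SQL parameters.
LOOKUPS = {"Item": "Name", "Source": "Source", "Preparation": "Prepare"}

def get_or_add(table, value):
    """Return the key for value; an unseen value is added ('gathered') first."""
    column = LOOKUPS[table]
    row = conn.execute(
        f"SELECT ID FROM {table} WHERE {column} = ?", (value,)).fetchone()
    if row:
        return row[0]
    return conn.execute(
        f"INSERT INTO {table} ({column}) VALUES (?)", (value,)).lastrowid

def add_ingredient(item, source, preparation):
    """Return the Ingredient ID for this combination, creating it if needed."""
    keys = (get_or_add("Item", item),
            get_or_add("Source", source),
            get_or_add("Preparation", preparation))
    row = conn.execute(
        "SELECT ID FROM Ingredient WHERE Item = ? AND Source = ? AND Preparation = ?",
        keys).fetchone()
    if row:
        return row[0]
    return conn.execute(
        "INSERT INTO Ingredient (Item, Source, Preparation) VALUES (?, ?, ?)",
        keys).lastrowid

add_ingredient("cilantro", "fresh", "chopped")
add_ingredient("cilantro", "fresh", "minced")
print(conn.execute("SELECT Name FROM Item").fetchall())
print(conn.execute("SELECT Prepare FROM Preparation").fetchall())
```

After a training run, listing each lookup table as in the last two lines is the simple inspection step mentioned above for spotting silly entries.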
"Ralph" <nt_consultin***@yahoo.com> wrote in message news:3JidnUb_G7Lw7dveRVn-hg@arkansas.net...
> Obviously during training some silly configurations will appear. Many can be
> caught by just examining the various tables.
>
> -ralph

I've got all the tables... I just need to train it :)

Ivan
"Ivan Debono" <ivanm***@hotmail.com> wrote in message news:OCi$%238zyFHA.2556@TK2MSFTNGP10.phx.gbl...
> I've got all the tables... I just need to train it :)
>
> Ivan

I can see that from your replies to the others. It wasn't obvious to me from your OP how far you had gotten in your design. Apologies if I was too pedantic.

-ralph

> Apologies if I was too pedantic.
>
> -ralph

No problem.

I also included two features after posting.

1. A Soundex function. It compares the Soundex codes and assigns a rating.
2. A rating system that breaks the sentence down into words and compares those.

Sorting by rating in descending order puts the highest rating first. 98% of the time it's really accurate, but it fails miserably in the remaining 2%.

Would you guys do something similar, or use an alternative rating system?

Ivan
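For readers who want to experiment with the two features just described, here is a minimal sketch in Python (not VB6). The Soundex routine follows the standard American Soundex rules. The rate function, its 0.5 weight for a Soundex-only match, and the rank helper are assumptions for illustration and not the scoring used in the actual app.

```python
import re

# Standard American Soundex digit groups.
CODES = {ch: digit
         for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                                "4": "l", "5": "mn", "6": "r"}.items()
         for ch in letters}

def soundex(word):
    """Return the 4-character Soundex code of word ('' for no letters)."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    result = word[0].upper()
    prev = CODES.get(word[0], "")
    for ch in word[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "hw":  # h and w do not separate letters with equal codes
            prev = code
    return (result + "000")[:4]

def rate(line, candidate):
    """Score 0..1: share of candidate words found in line, exactly or by sound."""
    line_words = re.findall(r"[a-z]+", line.lower())
    cand_words = re.findall(r"[a-z]+", candidate.lower())
    if not cand_words:
        return 0.0
    line_codes = {soundex(w) for w in line_words}
    score = 0.0
    for word in cand_words:
        if word in line_words:
            score += 1.0
        elif soundex(word) in line_codes:
            score += 0.5  # assumed weight for a sound-alike match
    return score / len(cand_words)

def rank(line, candidates):
    """Candidates sorted by rating, best first."""
    return sorted(candidates, key=lambda c: rate(line, c), reverse=True)

print(soundex("cilantro"), soundex("celantro"))
print(rank("fresh cilantro leaves", ["onion", "celery", "cilantro"]))
```

Soundex gives the same code to different words that merely sound alike, so a sound-only match is weighted lower here than an exact word match. Collisions of that kind are one plausible source of occasional bad matches.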
"Ivan Debono" <ivanm***@hotmail.com> wrote in message news:OQmj5qLzFHA.2960@tk2msftngp13.phx.gbl...
> You guys would do something similar or an alternative rating system?
>
> Ivan

Would it make sense to implement regular expressions, and what would the search string look like?

Ivan
"Ivan Debono" <ivanm***@hotmail.com> wrote in message news:eoilgwLzFHA.2008@TK2MSFTNGP10.phx.gbl...
> Would it make sense to implement regular expressions and what would the
> search string look like?
>
> Ivan

Regular expressions definitely make pattern-matching searches easier, but they carry a lot of overhead. Though they simplify the 'code' - you write a pattern and submit something to be 'parsed' - there is a very complex engine running underneath. The advantage is you didn't have to write the engine. So they are useful in that regard. I would definitely use them in the beginning. Saves a lot of writing, but it is only an implementation tool.

Note RE only works with 'patterns', i.e., you have to have a discernible pattern before they do you any good. Obviously a recipe that contained a bulleted list of ingredients at the top or other defined structure would lend itself to defining a pattern, but what about some plain text that adds ingredients as it rattles on? OSFA (one size fits all) will not apply.

Your task can be broken down into Input massage -> Knowledgebase -><- Forward Chaining -><- Inference Engine -><-- Backward Chaining -> Output. You will likely use a ton of different tools as you work your way through the process. There is a lot of useful information on the net and several books on the subject. It can get complicated.

Don't leave out the chaining. Automate the 'learning' whenever possible - else you will find yourself merely creating a linear procedural process with a ton of conditionals. <g>

But take heart that what you are doing is creating in a sense an 'expert' system. No two solutions are ever the same - depends on what 'expertise' is needed in the output. Also by the time you are done, you will likely be eligible for a good job at Google.

My only exposure to AI/Expert systems is in training and backward chaining (Neural Nets). The knowledgebase and Inference engines were rather well defined before I got there. (Thank ghod! <g>) When you get there give me a call. <smile>

Enjoy. I can't think of a more fascinating project. I know I thoroughly enjoyed every one that I have been involved in and missed it when I left.

-ralph
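As a concrete, hedged answer to the 'what would the search string look like' question, here are two small patterns in Python's re syntax (a VB6 app would need its own regular expression library, and the syntax may differ slightly). Both assume the ingredient lines follow the shapes seen in the sample list, i.e. a head, then optional notes after '--' or a comma. The names split_line and find_preparation and the word lists inside the pattern are made up for the example.

```python
import re

# Separator between the head of the line and trailing notes: "--" or a comma.
SEPARATOR = re.compile(r"\s*(?:--|,)\s*")

# An optional adverb followed by a known preparation word (assumed lists).
PREPARATION = re.compile(
    r"\b(?:(?:finely|coarsely)\s+)?(?:chopped|snipped|minced)\b",
    re.IGNORECASE)

def split_line(line):
    """Split at the first separator into (head, notes); notes may be ''."""
    parts = SEPARATOR.split(line.strip(), maxsplit=1)
    head = parts[0]
    notes = parts[1] if len(parts) > 1 else ""
    return head, notes

def find_preparation(line):
    """Return the first preparation phrase in lower case, or None."""
    match = PREPARATION.search(line)
    return match.group(0).lower() if match else None

print(split_line("Cilantro -- finely chopped"))
print(split_line("Cilantro, fresh, chopped"))
print(find_preparation("fresh cilantro leaves -- coarsely chopped"))
```

As noted above, this only helps while the input really has such a pattern. Free-running recipe text would defeat both expressions.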
"Ralph" <nt_consultin***@yahoo.com> wrote in message news:58adnZTIRONJstTeRVn-jg@arkansas.net...
> My only exposure to AI/Expert systems is in training and backward chaining
> (Neural Nets). The knowledgebase and Inference engines were rather well
> defined before I got there.

Did you develop the systems in VB?

Ivan
"Ivan Debono" <ivanm***@hotmail.com> wrote in message news:%23wvXoXWzFHA.3256@TK2MSFTNGP09.phx.gbl...
> Did you develop the systems in vb?

Yes, the training and presentation parts anyway. But never the engines; I always used the assistance of some libraries. STATISTICA Neural Networks and other tools (http://www.statsoft.com) were my favorites. (I rewrote the VB interface and typelib for them, so got a free copy, but that was a few years ago. They have newer stuff now, so you can trust it. <g>)

But you can find a number of other tools for free on the web. My philosophy is to let the bright kids at MIT and Google write the stuff; I just want to use it. You won't find free 'complete' suites or IDEs, just a lot of bits and pieces. But you will find you can assemble a rather well-working collage. With a little practice you can become a great thief. <g>

Also, appreciate that elegance in the beginning is wasted effort. If ever there is a case of "code to understand", an expert system is it.

VB6 is a great tool as your 'glue' to hold everything together, due to its rapid prototyping/development abilities. You will probably want to learn C or Java for most of the stuff you will find on the web. Many come in other languages, but I found that C made it a lot easier to port to a component I could use with VB. Even if you don't use the C itself, you can generally get the gist enough to convert to VB.

I played with Prolog and Lisp, ideal for some things - but the trouble is they are so odd - what do you do with the d**n thing when you are done? Too much impedance mismatch with the 'real' world. But that is just me - your mileage may vary.

Anyway, I digress. You are still in the knowledgebase portion of the project. Scour the net for lexicons, parsing algorithms, etc. If you run into trouble, post your question here. There are quite a few bright people around here who can do that stuff in their sleep - plus, when the smoke clears Rick will come in and supply a one-line solution. <g>

Good luck,
-ralph
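As a rough illustration of the "knowledgebase portion" (the word lists suggested earlier in the thread), here is a minimal Python sketch that strips preparation verbs and descriptors from an ingredient line and keeps what is left as the ingredient name. The three word lists and the `normalize` function are assumptions made up for this example; a real lexicon would be far larger and would probably live in a table in the Access backend rather than in code.

```python
import re

# Hypothetical lexicon, seeded only with words from the cilantro examples.
PREP_WORDS = {"snipped", "chopped", "minced"}
DESCRIPTORS = {"fresh", "finely", "coarsely", "optional"}
# Treating "leaves" as noise is a simplification; it would break "bay leaves".
FILLER = {"of", "leaves"}

NOISE = PREP_WORDS | DESCRIPTORS | FILLER


def normalize(line):
    """Lower-case the line, drop punctuation and noise words, keep the rest."""
    words = re.findall(r"[a-z]+", line.lower())
    return " ".join(w for w in words if w not in NOISE)


EXAMPLES = [
    "Snipped fresh cilantro",
    "Cilantro -- finely chopped",
    "Leaves of Cilantro -- optional",
    "Chopped cilantro",
    "Cilantro, fresh, chopped",
    "Cilantro, chopped",
    "fresh cilantro leaves -- coarsely chopped",
    "Cilantro, minced",
]

if __name__ == "__main__":
    for line in EXAMPLES:
        print(repr(line), "->", repr(normalize(line)))
```

With these lists, all eight example lines reduce to the single key `cilantro`, which can then be looked up in the ingredient table before deciding whether to insert a new record. Lines that contain words missing from the lexicon will not reduce cleanly, which is where a rating scheme such as the Soundex and word comparison mentioned in the thread would still be needed.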