• given Dict=(act, eat, sat, ...) make a long chain (no repeats) with 2-l

    From HenHanna@21:1/5 to All on Mon Jun 17 12:01:27 2024
    XPost: sci.lang, sci.math

    given (a list of 3-letter words)
    Dict=(act, ATT, eat, sat, sit, cat, bat, dog, god, mat, tim, kim, ...)

    The object is to make a long chain (no repeats) with 2-letter overlaps.

    e.g. -- [cat, ate, tea, eat, ATT, ...]



    What's a good approach (in Python)?


    in Mathematica, it's easy to find THE Longest chain?

    is this a typical NP-complete problem?

    ________________

    -- Martha has aspirin in industrial allotments.

    -- Two women enter erotic icehouse, seduce celibate teacher.

    -- Rush showed editorial alarmism, smeared educational alliance ceaselessly.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Waldby@21:1/5 to HenHanna on Mon Jun 17 23:24:15 2024
    XPost: sci.lang, sci.math

    In sci.math HenHanna <HenHanna@devnull.tb> wrote:
    given (a list of 3-letter words)
    Dict=(act, ATT, eat, sat, sit, cat, bat, dog, god, mat, tim, kim, ...)

    The object is to make a long chain (no repeats) with 2-letter overlaps.

    e.g. -- [cat, ate, tea, eat, ATT, ...]

    What's a good approach (in Python)?

    According to ref 1, longest-path problems are NP-complete. At the
    moment there's no method known that is "good" in general for the
    problem. However, if all of the dictionary words are chosen from a natural-language, then we have a special (not general) case. I think
    in this special case a method like finding pairs, then combining pairs
    to triple, then triples to fives, fives to nines, etc, might work
    well, given obvious fallbacks to combining different-length sequences
    when at some length same-length combinations don't exist.

    *Ref 1 <https://en.wikipedia.org/wiki/Longest_path_problem#NP-hardness>
    *Ref 2 <https://en.wikipedia.org/wiki/NP-completeness>


    in Mathematica, it's easy to find THE Longest chain?

    is this a typical NP-complete problem?

    As noted in ref 2, "A problem p in NP is NP-complete if every other
    problem in NP can be transformed (or reduced) into p in polynomial
    time", so there is a sense in which every NP-complete problem is a
    typical NP-complete problem.

    ________________

    -- Martha has aspirin in industrial allotments.

    -- Two women enter erotic icehouse, seduce celibate teacher.

    -- Rush showed editorial alarmism, smeared educational alliance ceaselessly.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From HenHanna@21:1/5 to James Waldby on Mon Jun 17 19:32:58 2024
    XPost: sci.lang, sci.math

    On 6/17/2024 4:24 PM, James Waldby wrote:
    In sci.math HenHanna <HenHanna@devnull.tb> wrote:
    given (a list of 3-letter words)
    Dict=(act, ATT, eat, sat, sit, cat, bat, dog, god, mat, tim, kim, ...)

    The object is to make a long chain (no repeats) with 2-letter overlaps.

    e.g. -- [cat, ate, tea, eat, ATT, ...]

    What's a good approach (in Python)?

    According to ref 1, longest-path problems are NP-complete. At the
    moment there's no method known that is "good" in general for the
    problem. However, if all of the dictionary words are chosen from a natural-language, then we have a special (not general) case. I think
    in this special case a method like finding pairs, then combining pairs
    to triple, then triples to fives, fives to nines, etc, might work
    well, given obvious fallbacks to combining different-length sequences
    when at some length same-length combinations don't exist.

    *Ref 1 <https://en.wikipedia.org/wiki/Longest_path_problem#NP-hardness>
    *Ref 2 <https://en.wikipedia.org/wiki/NP-completeness>


    in Mathematica, it's easy to find THE Longest chain?

    is this a typical NP-complete problem?

    As noted in ref 2, "A problem p in NP is NP-complete if every other
    problem in NP can be transformed (or reduced) into p in polynomial
    time", so there is a sense in which every NP-complete problem is a
    typical NP-complete problem.

    ________________

    -- Martha has aspirin in industrial allotments.

    -- Two women enter erotic icehouse, seduce celibate teacher.

    -- Rush showed editorial alarmism, smeared educational alliance ceaselessly.


    thanks!




    word Chain with 3-letter overlaps (between successive words)

    10 (shtoomest, estrayed, yedo, edomitish,
    ish, ishime, imelle, ller, lerps, rps)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Waldby@21:1/5 to HenHanna on Sat Jul 6 22:08:53 2024
    XPost: sci.lang, sci.math

    In sci.math HenHanna <HenHanna@devnull.tb> wrote:
    On 6/17/2024 4:24 PM, James Waldby wrote:
    In sci.math HenHanna <HenHanna@devnull.tb> wrote:
    given (a list of 3-letter words)
    Dict=(act, ATT, eat, sat, sit, cat, bat, dog, god, mat, tim, kim, ...) >>>
    The object is to make a long chain (no repeats) with 2-letter overlaps.
    e.g. -- [cat, ate, tea, eat, ATT, ...]
    What's a good approach (in Python)?

    According to ref 1, longest-path problems are NP-complete. At the
    moment there's no method known that is "good" in general for the
    problem. However, if all of the dictionary words are chosen from a
    natural-language, then we have a special (not general) case. I think
    in this special case a method like finding pairs, then combining pairs
    to triple, then triples to fives, fives to nines, etc, might work
    well, given obvious fallbacks to combining different-length sequences
    when at some length same-length combinations don't exist.

    *Ref 1 <https://en.wikipedia.org/wiki/Longest_path_problem#NP-hardness>
    *Ref 2 <https://en.wikipedia.org/wiki/NP-completeness>

    in Mathematica, it's easy to find THE Longest chain?
    is this a typical NP-complete problem?

    As noted in ref 2, "A problem p in NP is NP-complete if every other
    problem in NP can be transformed (or reduced) into p in polynomial
    time", so there is a sense in which every NP-complete problem is a
    typical NP-complete problem.
    ...
    word Chain with 3-letter overlaps (between successive words)

    10 (shtoomest, estrayed, yedo, edomitish,
    ish, ishime, imelle, ller, lerps, rps)

    I wrote some programs to look for word chains, first on the plan
    mentioned earlier -- finding pairs, then combining pairs to triples,
    triples to fives, etc which worked ok to find chains of up to 30 or 40
    words, but this turned out to be fairly slow and of course memory
    intensive, running out of memory after a few hours on my 64GB-ram
    system. Then I tried depth-first recursion, which in .1 second can
    find chains of 150+ words, using 6-letter dictionary words with
    3-letter overlaps, but taking 1.5 hours to find its first 171-word
    chain, 11 hours to find a 174-chain, 30 hours for a 177-chain, then no
    progress in the next 142 hours of computation. With a couple of minor
    program changes I got the 174 time below 6 minutes, and time to 177
    around 14 minutes, but no progress beyond that in the next 18 hours.

    Here's an 8-letter/4-overlap 10-chain example, based on an
    88-word dictionary of 8-letter words, all containing `over`:
    10 : discover overhang hangover overhung hungover overtake
    takeover overturn turnover overacts at 0.201194 seconds

    Example 6-letter/3-overlap 177-chain:
    177 : abacus cuspid pidgin ginger gerbil billet lethal halter terser
    serene enemas mascot cotton tonsil silent entice icebox boxcar carbon
    bonbon bongos gospel pellet letups upshot hotbed bedbug bugles lessee
    seeped pedant anthem hempen pennon noncom combat bather herbal balder
    dermis misfit fitful fulcra craven vendor dormer merman manful fulfil
    filthy thymus muscat catnap napped pedlar largos gossip siphon honcho
    choice icecap captor torpor portal talent entrap rapper permit mitten
    tenpin pintos tosses sesame amebas basins instil tildes despot potash
    ashcan cancan cancel cellar larvas vassal salver verbal ballad ladles
    lessen sensor sorbet betcha chapel pelvis viscus custom tombed bedlam
    lambed bedpan pantry tryout outlay layout outran rancor cornea neared
    redden denser server vermin minute uterus russet settee teepee peewee
    weevil villas lassie siesta stamen mentor torque queasy asylum lumbar
    barfed fedora orator torrid ridden dentin tinsel selves vessel selfie
    fiesta stadia dialog logout outrun runway waylay layman mantra trains
    insult ultras rascal callus lustre tremor mortar tartar tarpon poncho
    chores resend endear earwig wigwam wampum pumper person sonnet nether
    hernia niacin cinder derail ailing ingest esters ersatz at 820.516 seconds

    Example 8-letter/4-overlap 79-chain:
    79: abjuring ringside sidekick kickback backbone bonehead headland
    landlady ladyship shipload loadstar starfish fishtail tailgate
    gatepost postdate dateline linefeed feedback backfire fireside
    sidelong longhand handbook bookcase casework workbook bookworm
    wormwood woodcock cocksure surefire firewood woodland landmark
    markdown downhill hillside sideshow showdown downturn turnover
    overcome comeback backhand handrail railroad roadkill killdeer
    deerskin skinhead headword wordplay playback backstop stopover
    overhand handsome somewhat whatever evermore moreover overhang
    hangover overhung hungover overtake takeover overdraw drawback
    backrest restrain raindrop dropouts outshone honester sternest
    nestling lingered at 54577.844678 seconds

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From HenHanna@21:1/5 to James Waldby on Sat Jul 6 22:42:38 2024
    XPost: sci.lang, sci.math

    On 7/6/2024 3:08 PM, James Waldby wrote:
    In sci.math HenHanna <HenHanna@devnull.tb> wrote:
    On 6/17/2024 4:24 PM, James Waldby wrote:
    In sci.math HenHanna <HenHanna@devnull.tb> wrote:
    given (a list of 3-letter words)
    Dict=(act, ATT, eat, sat, sit, cat, bat, dog, god, mat, tim, kim, ...) >>>>
    The object is to make a long chain (no repeats) with 2-letter overlaps. >>>> e.g. -- [cat, ate, tea, eat, ATT, ...] >>>> What's a good approach (in Python)?

    According to ref 1, longest-path problems are NP-complete. At the
    moment there's no method known that is "good" in general for the
    problem. However, if all of the dictionary words are chosen from a
    natural-language, then we have a special (not general) case. I think
    in this special case a method like finding pairs, then combining pairs
    to triple, then triples to fives, fives to nines, etc, might work
    well, given obvious fallbacks to combining different-length sequences
    when at some length same-length combinations don't exist.

    *Ref 1 <https://en.wikipedia.org/wiki/Longest_path_problem#NP-hardness>
    *Ref 2 <https://en.wikipedia.org/wiki/NP-completeness>

    in Mathematica, it's easy to find THE Longest chain? >>>> is this a typical NP-complete problem?

    As noted in ref 2, "A problem p in NP is NP-complete if every other
    problem in NP can be transformed (or reduced) into p in polynomial
    time", so there is a sense in which every NP-complete problem is a
    typical NP-complete problem.
    ...
    word Chain with 3-letter overlaps (between successive words)

    10 (shtoomest, estrayed, yedo, edomitish,
    ish, ishime, imelle, ller, lerps, rps)

    I wrote some programs to look for word chains, first on the plan
    mentioned earlier -- finding pairs, then combining pairs to triples,
    triples to fives, etc which worked ok to find chains of up to 30 or 40
    words, but this turned out to be fairly slow and of course memory
    intensive, running out of memory after a few hours on my 64GB-ram
    system. Then I tried depth-first recursion, which in .1 second can
    find chains of 150+ words, using 6-letter dictionary words with
    3-letter overlaps, but taking 1.5 hours to find its first 171-word
    chain, 11 hours to find a 174-chain, 30 hours for a 177-chain, then no progress in the next 142 hours of computation. With a couple of minor program changes I got the 174 time below 6 minutes, and time to 177
    around 14 minutes, but no progress beyond that in the next 18 hours.



    Here's an 8-letter/4-overlap 10-chain example, based on an
    88-word dictionary of 8-letter words, all containing `over`:
    10 : ( discover overhang hangover overhung hungover overtake
    takeover overturn turnover overacts ) at 0.201194 seconds


    menomini + minidisc +

    ( discover overhang hangover overhung hungover overtake
    takeover overturn turnover overacts )



    Example 6-letter/3-overlap 177-chain:
    177 : abacus cuspid pidgin ginger gerbil billet lethal halter terser
    serene enemas mascot cotton tonsil silent entice icebox boxcar carbon
    bonbon bongos gospel pellet letups upshot hotbed bedbug bugles lessee
    seeped pedant anthem hempen pennon noncom combat bather herbal balder
    dermis misfit fitful fulcra craven vendor dormer merman manful fulfil
    filthy thymus muscat catnap napped pedlar largos gossip siphon honcho
    choice icecap captor torpor portal talent entrap rapper permit mitten
    tenpin pintos tosses sesame amebas basins instil tildes despot potash
    ashcan cancan cancel cellar larvas vassal salver verbal ballad ladles
    lessen sensor sorbet betcha chapel pelvis viscus custom tombed bedlam
    lambed bedpan pantry tryout outlay layout outran rancor cornea neared
    redden denser server vermin minute uterus russet settee teepee peewee
    weevil villas lassie siesta stamen mentor torque queasy asylum lumbar
    barfed fedora orator torrid ridden dentin tinsel selves vessel selfie
    fiesta stadia dialog logout outrun runway waylay layman mantra trains
    insult ultras rascal callus lustre tremor mortar tartar tarpon poncho
    chores resend endear earwig wigwam wampum pumper person sonnet nether
    hernia niacin cinder derail ailing ingest esters ersatz at 820.516 seconds

    Example 8-letter/4-overlap 79-chain:
    79: abjuring ringside sidekick kickback backbone bonehead headland
    landlady ladyship shipload loadstar starfish fishtail tailgate
    gatepost postdate dateline linefeed feedback backfire fireside
    sidelong longhand handbook bookcase casework workbook bookworm
    wormwood woodcock cocksure surefire firewood woodland landmark
    markdown downhill hillside sideshow showdown downturn turnover
    overcome comeback backhand handrail railroad roadkill killdeer
    deerskin skinhead headword wordplay playback backstop stopover
    overhand handsome somewhat whatever evermore moreover overhang
    hangover overhung hungover overtake takeover overdraw drawback
    backrest restrain raindrop dropouts outshone honester sternest
    nestling lingered at 54577.844678 seconds


    my PC is slow... but i got curious... (about 6-letter/3-overlap)

    ........, massig,signal,nalfon, +

    ........, reshun,hungry,gryfon, +

    95 ( fondak, dakota, otakus, kuskus, kussos, sossed, sedate, ateles, lesses, sestet, tethys, hyssop, sophia, hiatus, tuscan, canell, ellops,
    opsins, insoul, oulder, dermol, molten, tenure, uretal, talpas, passes,
    seskin, kindie, dietal, tallis, lisbon, bonagh, aghast, astony, onymal,
    malice, icecap, caplin, lingas, gassit, situps, upsend, endore, oregon,
    gonads, adsorb, orbits, itself, elfish, ishtar, tarpot, potboy, boyaux,
    auxins, inship, hippus, pusley, leymus, muslin, linsey, seyens, ensure,
    uredia, diamat, matlos, losels, elsins, insunk, unkill, illing, ingoes,
    oesogi, ogived, vedism, ismdom, domett, ettles, lessee, seeing, ingans,
    anshar, hardim, dimply, plying, inglut, lutzes, zester, ternar, narica,
    icarus, rushes, hestia, tiamat, mating, ingots )



    i think... A long (straight) chain like this (below) is harder to find
    (no dict for them)

    Survey -
    Monkey - Business - Casual - Sex - Work - Visa - Card - Shark - Tank
    - Top - Secret - Agent - Orange - Julius - Caesar - Salad - Dressing

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Waldby@21:1/5 to HenHanna on Sun Jul 7 20:41:41 2024
    XPost: sci.lang, sci.math

    In sci.math HenHanna <HenHanna@devnull.tb> wrote:
    On 7/6/2024 3:08 PM, James Waldby wrote:
    ...
    Here's an 8-letter/4-overlap 10-chain example, based on an
    88-word dictionary of 8-letter words, all containing `over`:
    10 : ( discover overhang hangover overhung hungover overtake
    takeover overturn turnover overacts ) at 0.201194 seconds

    menomini + minidisc +
    ( discover overhang hangover overhung hungover overtake
    takeover overturn turnover overacts )

    Example 6-letter/3-overlap 177-chain:
    177 : abacus cuspid pidgin ginger gerbil billet lethal halter terser
    serene enemas mascot cotton tonsil silent entice icebox boxcar carbon
    ...
    hernia niacin cinder derail ailing ingest esters ersatz at 820.516 seconds >>
    Example 8-letter/4-overlap 79-chain:
    79: abjuring ringside sidekick kickback backbone bonehead headland
    landlady ladyship shipload loadstar starfish fishtail tailgate
    gatepost postdate dateline linefeed feedback backfire fireside
    ...
    backrest restrain raindrop dropouts outshone honester sternest
    nestling lingered at 54577.844678 seconds

    my PC is slow... but i got curious... (about 6-letter/3-overlap)

    ........, massig,signal,nalfon, +

    ........, reshun,hungry,gryfon, +

    95 ( fondak, dakota, otakus, kuskus, kussos, sossed, sedate, ateles, lesses, sestet, tethys, hyssop, sophia, hiatus, tuscan, canell, ellops, opsins, insoul, oulder, dermol, molten, tenure, uretal, talpas, passes, seskin, kindie, dietal, tallis, lisbon, bonagh, aghast, astony, onymal, malice, icecap, caplin, lingas, gassit, situps, upsend, endore, oregon, gonads, adsorb, orbits, itself, elfish, ishtar, tarpot, potboy, boyaux, auxins, inship, hippus, pusley, leymus, muslin, linsey, seyens, ensure, uredia, diamat, matlos, losels, elsins, insunk, unkill, illing, ingoes, oesogi, ogived, vedism, ismdom, domett, ettles, lessee, seeing, ingans, anshar, hardim, dimply, plying, inglut, lutzes, zester, ternar, narica, icarus, rushes, hestia, tiamat, mating, ingots )

    Do you have a link to the dictionary you're using? Among the first 17
    words (for example) of that sequence, only sedate and hiatus are in
    linux's /usr/share/dict/american-english file.

    i think... A long (straight) chain like this (below) is harder to find
    (no dict for them)

    Survey -
    Monkey - Business - Casual - Sex - Work - Visa - Card - Shark - Tank
    - Top - Secret - Agent - Orange - Julius - Caesar - Salad - Dressing

    I think the basics of the problem are not more difficult, if a
    dictionary of collocations were to be found. I don't know of a free
    one, or whether the following would be accessible if paid for. https://www.oxfordlearnersdictionaries.com/us/definition/collocations

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From HenHanna@21:1/5 to James Waldby on Sun Jul 7 14:23:29 2024
    XPost: sci.lang, sci.math

    On 7/7/2024 1:41 PM, James Waldby wrote:
    In sci.math HenHanna <HenHanna@devnull.tb> wrote:
    On 7/6/2024 3:08 PM, James Waldby wrote:
    ...
    Here's an 8-letter/4-overlap 10-chain example, based on an
    88-word dictionary of 8-letter words, all containing `over`:
    10 : ( discover overhang hangover overhung hungover overtake
    takeover overturn turnover overacts ) at 0.201194 seconds

    menomini + minidisc +
    ( discover overhang hangover overhung hungover overtake
    takeover overturn turnover overacts )

    Example 6-letter/3-overlap 177-chain:
    177 : abacus cuspid pidgin ginger gerbil billet lethal halter terser
    serene enemas mascot cotton tonsil silent entice icebox boxcar carbon
    ...
    hernia niacin cinder derail ailing ingest esters ersatz at 820.516 seconds >>>
    Example 8-letter/4-overlap 79-chain:
    79: abjuring ringside sidekick kickback backbone bonehead headland
    landlady ladyship shipload loadstar starfish fishtail tailgate
    gatepost postdate dateline linefeed feedback backfire fireside
    ...
    backrest restrain raindrop dropouts outshone honester sternest
    nestling lingered at 54577.844678 seconds

    my PC is slow... but i got curious... (about 6-letter/3-overlap)

    ........, massig,signal,nalfon, +

    ........, reshun,hungry,gryfon, +

    95 ( fondak, dakota, otakus, kuskus, kussos, sossed, sedate, ateles,
    lesses, sestet, tethys, hyssop, sophia, hiatus, tuscan, canell, ellops,
    opsins, insoul, oulder, dermol, molten, tenure, uretal, talpas, passes,
    seskin, kindie, dietal, tallis, lisbon, bonagh, aghast, astony, onymal,
    malice, icecap, caplin, lingas, gassit, situps, upsend, endore, oregon,
    gonads, adsorb, orbits, itself, elfish, ishtar, tarpot, potboy, boyaux,
    auxins, inship, hippus, pusley, leymus, muslin, linsey, seyens, ensure,
    uredia, diamat, matlos, losels, elsins, insunk, unkill, illing, ingoes,
    oesogi, ogived, vedism, ismdom, domett, ettles, lessee, seeing, ingans,
    anshar, hardim, dimply, plying, inglut, lutzes, zester, ternar, narica,
    icarus, rushes, hestia, tiamat, mating, ingots )

    Do you have a link to the dictionary you're using? Among the first 17
    words (for example) of that sequence, only sedate and hiatus are in
    linux's /usr/share/dict/american-english file.


    Do you prefer that these words be excluded?
    dublin, otaku, otakus, shinju, ohtani,
    jonathan, eliza, mary, Biden, boston, oregon


    i think... A long (straight) chain like this (below) is harder to find
    (no dict for them)

    Survey -
    Monkey - Business - Casual - Sex - Work - Visa - Card - Shark - Tank
    - Top - Secret - Agent - Orange - Julius - Caesar - Salad - Dressing

    I think the basics of the problem are not more difficult, if a
    dictionary of collocations were to be found. I don't know of a free
    one, or whether the following would be accessible if paid for. https://www.oxfordlearnersdictionaries.com/us/definition/collocations



    i've used a bigger dict before, but stopped using that one,
    when i realized that it was giving me so many words
    i'd never seen and will never use or learn.

    https://github.com/possibly-wrong/word-frequency/blob/main/word-frequency.txt



    _________________ Rearrange the words into a chain of 2-word phrases:

    Rocky Break Road Point Compound
    Game Eye Open Level



    (Hint: the ends (Rocky ... Level) don't move)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)