• Thread or fork for a network server: which is the better way?

    From Michael Niehren@21:1/5 to All on Wed Jun 1 08:38:08 2022
    Hi together,

    I am currently running a Tcl network server under Linux that forks for
    every new connection; the child process then handles the request.

    Having seen ET's tasks module here on the list, I'm thinking about
    switching to threads and handling the requests in a thread pool, in the
    hope of getting speed improvements.

    So, what do you think: which is the better way?
    Should I expect speed improvements from a thread pool compared with
    forking on every new incoming connection, and would the improvements
    be big enough to justify the effort of switching?

    Currently I have one binary that includes all procedures for handling the
    request, and that binary forks. As far as I know, if I switch to threads, I
    have to import all my defined procedures into every newly started thread,
    so I would have to split my binary into two parts. Is that right, or is
    there a simple way to define all procedures of the currently running
    process in a thread that this process starts?

    best regards
    Michael

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rich@21:1/5 to Michael Niehren on Wed Jun 1 13:22:25 2022
    Michael Niehren <michael@niehren.de> wrote:
    > Hi together,
    >
    > As i see the tasks-Modul from ET here on the list, i'm thinking about
    > to switch to threads to handle the request in a thread-pool and hope
    > to get speed improvements.

    The only way to really know is to put in the effort for a rewrite and
    measure the difference.

    However, on Linux, the time difference between launching a new thread
    and forking (at the C level) is minimal (versus, say, Windows, where the
    equivalent of fork is an order of magnitude slower than launching a
    thread). Because the difference is so small under Linux, Tcl interpreter
    overhead will likely dominate both versions, to the extent that you see
    little measurable difference in speed.

    > So, what do you think of it, what is the better way ?
    > Is it expected to have speed improvements when using a thread pool in
    > contrast to fork on every new incoming connection and will the
    > improvements be big enough to legitimate the effort for switching ?

    Without putting in the effort to rewrite, and then measuring, no one
    can know. But given my paragraph above, I predict you'd not see a huge
    difference, with one exception: if requirements changed such that the
    forked processes suddenly needed to share a data structure. Shared
    access to a common data structure will likely be much faster in Tcl
    with the Thread package and its TSV (thread shared variables) commands
    than trying to share that data among forked processes.
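
    As a rough illustration of the shared-data case described above, here is
    a minimal sketch using tsv from the Thread package. The array name
    "stats" and the element "hits" are made up for the example; two worker
    threads update one counter that the main thread then reads.

```tcl
package require Thread

# Hypothetical shared array "stats" with one counter element "hits".
tsv::set stats hits 0

# Start two joinable worker threads; each bumps the shared counter 1000 times.
# tsv::incr updates the shared element atomically.
set workers {}
for {set i 0} {$i < 2} {incr i} {
    lappend workers [thread::create -joinable {
        for {set n 0} {$n < 1000} {incr n} {
            tsv::incr stats hits
        }
    }]
}

# Wait for both workers to finish, then read the shared value.
foreach tid $workers {
    thread::join $tid
}
set hits [tsv::get stats hits]
puts "hits = $hits"
```

    Doing the same between forked processes would need pipes, sockets, files
    or some other form of inter-process communication.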

    > Currently i have 1 binary with all procedures for handling the request
    > included, which then forked. As far as i know, if i switch to threads, i
    > have to import all my defined procedures in every new starting thread,

    Thread pools allow you to supply an init script that is executed each
    time a new pool member is spun up. It should initialize the thread
    to the point that it is a productive member of the pool.

    If you are using the raw thread::create call, then you have to do
    whatever initialization (module loading, defining procs/objects, etc.)
    is necessary. thread::create just hands you a raw interpreter without
    any initialization/module loading that you might have done in another
    thread. With that said, thread::create does take a "script" argument
    that is meant to be the "initialize this thread" script that performs
    that work. But you do have to repeat the work with each new thread.
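
    A minimal sketch of the raw thread::create case, assuming the Thread
    package is installed. The proc "handle_request" is a hypothetical
    stand-in for the server's real procedures; the same init script would be
    passed to every thread that gets created.

```tcl
package require Thread

# Init script: everything a new thread needs before it can do useful work.
# "handle_request" is a made-up placeholder for the real request handler.
set init {
    proc handle_request {line} {
        return [string toupper $line]
    }
}

# Run the init script in the new thread, then enter the event loop
# (thread::wait) so the thread stays alive and can accept work.
set tid [thread::create "$init\nthread::wait"]

# Synchronous send: runs the script in the worker and returns its result.
set result [thread::send $tid {handle_request "hello"}]
puts $result

thread::release $tid
```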

  • From et4@21:1/5 to Michael Niehren on Wed Jun 1 11:56:18 2022
    On 5/31/2022 11:38 PM, Michael Niehren wrote:
    > Hi together,
    >
    > Currently i have 1 binary with all procedures for handling the request
    > included, which then forked. As far as i know, if i switch to threads, i
    > have to import all my defined procedures in every new starting thread, so
    > i have to split my binary into 2 parts. Is that right or is there a simple
    > way to define all procedures of the current running process in a thread
    > that this process starts ?


    The first question I'd ask is: are you faced with performance issues
    now? Is your program running out of steam, or are you just looking to
    improve something that's already working?

    One difference between thread pools and tasks is that tpool has upper
    and lower bounds on the number of threads in the pool. Tasks allocate
    just one set at startup. While it would be possible to add more or
    reduce the number of them, there is no support for that at present and
    none is likely in the future. That was one complication I decided was
    not worth the trouble, but that's just my opinion.

    If you decide to use tpool, you can set the upper and lower bounds to the
    same number and it will not (afaik) allocate any more or kill off any of
    the existing threads, so there won't be any new importing of code into a
    new thread, since you'll just be reusing the ones you have. Then it
    should work like tasks. And what sort of importing do you think you are
    going to need?
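
    The fixed-size pool described above might look roughly like this. It is
    only a sketch: the pool size of 4 and the proc "handle_request" are
    assumptions made for the example, not anything from the original server.

```tcl
package require Thread

# Init script run in each pool thread when that thread is created.
# "handle_request" is a hypothetical placeholder for the real handler.
set init {
    proc handle_request {line} {
        return [string toupper $line]
    }
}

# Same lower and upper bound, so the pool is meant to stay at 4 workers.
set pool [tpool::create -minworkers 4 -maxworkers 4 -initcmd $init]

# Post one job, wait for it to finish, then fetch its result.
set job [tpool::post $pool {handle_request "hello"}]
tpool::wait $pool $job
set result [tpool::get $pool $job]
puts $result

tpool::release $pool
```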

    Tasks have a proc re-constructor, and it can take several patterns. If
    you specify just * as one of the elements in the import list argument, it
    will use [info proc *] and reconstruct each proc. Likewise if you have
    these in namespaces, you could use name::* as an element. If you have
    other inits to do, say TclOO, then you would have to import them
    differently. I've often wondered if TclOO can be completely introspected
    so it can be imported into a thread. I don't know enough about it, and I
    personally don't use TclOO, so I can't speculate on that.
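
    To show what reconstructing a proc by introspection can mean, here is a
    sketch built on info args, info default and info body. It is not the
    tasks module's actual code; the helper name "proc_source" and the
    "handle_*" naming pattern are assumptions for the example, and lmap needs
    Tcl 8.6 or later.

```tcl
package require Thread

# Rebuild the "proc ..." command that would define an existing proc,
# including default values for optional arguments.
proc proc_source {name} {
    set params {}
    foreach arg [info args $name] {
        if {[info default $name $arg value]} {
            lappend params [list $arg $value]
        } else {
            lappend params $arg
        }
    }
    return [list proc $name $params [info body $name]]
}

# Hypothetical handler defined in the main thread only.
proc handle_request {line {prefix "ok"}} {
    return "${prefix}: [string toupper $line]"
}

# Turn every proc matching the pattern into one init script and hand it
# to a new thread, which then waits for work.
set init [join [lmap p [info procs handle_*] {proc_source $p}] "\n"]
set tid [thread::create "$init\nthread::wait"]

set result [thread::send $tid {handle_request "hello"}]
puts $result

thread::release $tid
```

    This only covers plain procs in the global namespace; namespaces,
    variables, loaded packages and TclOO classes would each need their own
    handling.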

    With tasks, you can have a script variable, i.e. set script {...} and
    then specify -$script as one of the initializers. Tasks allow any number
    of these along with any number of wildcards that an [info proc pattern]
    can take. tpool has a single argument for that, but you could probably
    easily combine several scripts into a single one.

    I've not used ttrace, but it would appear that its purpose is similar,
    though it seems to do other things as well.

    As to performance, do you fork off a process for each connection or do
    you keep them around for additional ones? What does each fork do? Do
    they talk to each other?

    As to resources, I've estimated the cost of a new thread by a rather
    crude method: on 32-bit Windows, I could only create about 150 threads
    given the 2 GB address space limit (2 GB / 150 is roughly 14 MB). So, on
    the order of 10-20 MB per thread. You could do some easy tpool tests. On
    64-bit this likely won't matter, what with cheap RAM these days.

    I know that tsv is reasonably fast, because I've measured the amount of
    time it takes to give a task work (and it does it via tsv), and it's on
    the order of 50 microseconds, where a proc call is about 1 microsecond
    (on my 4 GHz 4090k Intel chip). How reasonable this is depends on how
    much work you do in each call. It would not be worthwhile to use tasks
    to compute anything that can be done with a single proc call in, say,
    100 microseconds or less. I also found that a synchronous thread::send
    was about 1/3 the cost of doing task calls.
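
    A comparison of this kind can be made with Tcl's built-in time command.
    The sketch below times a trivial, made-up proc "work" called locally and
    through a synchronous thread::send; the absolute numbers depend entirely
    on the machine and the Tcl build, so none are shown.

```tcl
package require Thread

# Trivial hypothetical workload, defined in the main thread and the worker.
proc work {x} { expr {$x + 1} }

set tid [thread::create {
    proc work {x} { expr {$x + 1} }
    thread::wait
}]

# time returns a string such as "N microseconds per iteration";
# the first word is the average cost per call.
set local  [lindex [time {work 1} 10000] 0]
set remote [lindex [time {thread::send $tid {work 1}} 10000] 0]

puts [format "local proc call:   %.2f microseconds" $local]
puts [format "sync thread::send: %.2f microseconds" $remote]

thread::release $tid
```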

    One thing you might do, if it won't cause your program to crash, is to
    have your forked processes simply bypass any workload. It's sort of like
    putting a return at the top of some proc you want to measure: run it
    both ways, once doing the real job and once reduced to no work, so you
    can measure the overhead.

    This can work as long as you don't need to compute anything. Another way
    is to just have a canned answer to simulate your workload.
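
    The bypass idea could be sketched as below. The global switch "bypass",
    the procs and the dummy workload are all invented for the example. In
    the real server the timing would be taken end to end from a client, so
    that the difference between the two runs shows the cost of the work and
    the bypassed run shows the per-connection overhead.

```tcl
# Hypothetical switch: 1 skips the real work and returns a canned answer.
set bypass 0

# Stand-in for the real per-request workload.
proc do_work {request} {
    set sum 0
    for {set i 0} {$i < 1000} {incr i} { incr sum $i }
    return $sum
}

proc handle_request {request} {
    global bypass
    if {$bypass} { return "canned answer" }
    return [do_work $request]
}

# Time the handler with the real work, then with the work bypassed.
set bypass 0
set full  [lindex [time {handle_request x} 1000] 0]
set bypass 1
set empty [lindex [time {handle_request x} 1000] 0]

puts [format "with work:    %.2f microseconds per call" $full]
puts [format "bypassed:     %.2f microseconds per call" $empty]
puts [format "work portion: %.2f microseconds per call" [expr {$full - $empty}]]
```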

    Anyway, you likely need to know the cost of each connection vs. the cost
    of what you do in each connection. With that, you can then probably know
    if it's worth switching to threads or tasks. If you do a lot of work in
    each connection, I'd stay with what you got since it works. And as Rich
    said, it will also depend on any inter thread/process communication you
    are doing, if any.

  • From Michael Niehren@21:1/5 to All on Wed Jun 1 23:23:56 2022
    Hi ET and Rich,

    Many thanks for your remarks.

    My network server with fork currently runs very well, without performance
    issues, so there is no need to switch to threads or tasks. It was only a
    thought for an improvement when I read about your tasks module.

    I think, if I find time, I will take my netserver code with fork, slim it
    down to a minimum and then measure 5 connections. After that I will try to
    change my code to use tasks or threads and measure 5 connections again, so
    I can really compare both.



  • From Ralf Fassel@21:1/5 to All on Thu Jun 2 10:21:03 2022
    * Michael Niehren <michael@niehren.de>
    | I think, if i find time, i will take my netserver-code with fork, slim it to
    | an minimum and then measure 5 connections. After that i try to change my
    | code for using tasks or thread and measure again 5 connections. So i can
    | realy compare both.

    While I have no experience in the subject, I somewhat doubt that you
    will be able to measure any significant difference with 'only' 5
    connections. 500 connections or more, and there will actually be some
    work to do...

    R'
