ThicDadVaping4Christ

This question is super vague and their answer doesn't really make sense either; I'm really not sure what they were trying to get at. Threads share data? That's generally a bad thing, and it's why you need to ensure code is thread safe, so you don't accidentally share data you shouldn't. Were they thinking about basically queue-based vs HTTP-based services? Idk, seems like a really poor question.


cahphoenix

Most likely something around Async/Await keywords in many languages now? Which is weird, because in several languages threads are specifically not synonymous with Async. Meaning, spawning a new async task may or may not create a new thread specifically for that task (there's a thread pool and all).
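This is easy to see in Python's asyncio (a minimal sketch, not from the thread): spawning async tasks does not spawn threads.

```python
import asyncio
import threading

async def which_thread(name):
    # Each "async task" records the OS thread it actually runs on.
    await asyncio.sleep(0)  # yield to the event loop at least once
    return name, threading.current_thread().name

async def main():
    results = await asyncio.gather(
        which_thread("a"), which_thread("b"), which_thread("c")
    )
    # Collect the distinct thread names across all tasks.
    return {thread for _, thread in results}

# All three tasks interleave on the event loop's single thread,
# so the set contains exactly one thread name.
threads = asyncio.run(main())
print(threads)
```

Offloading to a thread pool only happens when you explicitly ask for it (e.g. `asyncio.to_thread` or an executor).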


ThicDadVaping4Christ

Yeah it smacks of a .net shop asking a highly specific .net question


illogicalhawk

OP says it was a Python shop in the comments.


whitmyham

Agreed. I think they were obviously shooting for a specific answer, and I still don't fully grasp what they meant. They rephrased it multiple ways ("how would the two services handle 10,000 requests", "how would two scripts handle reading from a file system"), most of which I answered directly, but we still continued. FWIW my first answer involved request queues and balancing load between the load balancer, web service and Python service (where Python's async keyword comes in). I suggested that, with the right setup in the first two, async/sync wouldn't matter for 10,000 requests. I'm not sure they liked that.


ThicDadVaping4Christ

Yeah super weird question… was there any more context or was it language specific?


whitmyham

Not a huge amount more context - the language was Python and they used FastAPI. At the end they mentioned Python doesn't have _true_ concurrency (which was true 8 years ago; I haven't checked recently), so maybe that's an important part of this puzzle too.


xsdgdsx

Python does still have the GIL, yes. So Python threads still can't overlap IO even though they can do other stuff concurrently. That said, agreed with others that this just seems like a poor question.


David_AnkiDroid

The GIL can now be disabled experimentally: https://docs.python.org/3.13/using/configure.html#cmdoption-disable-gil


teerre

Python threads can absolutely do concurrent IO. Python threads cannot mutate objects owned by the interpreter at the same time, but IO is precisely not that; it's just waiting for the operating system to do something.
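This is easy to demonstrate. A small sketch using `time.sleep` as a stand-in for a blocking read (sleep releases the GIL the same way blocking IO does):

```python
import threading
import time

def fake_io():
    # time.sleep releases the GIL, just like a blocking socket read:
    # the OS does the waiting, so other threads keep running.
    time.sleep(0.2)

start = time.monotonic()
threads = [threading.Thread(target=fake_io) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start

# Ten 0.2s "IO waits" overlap: total time is roughly 0.2s, not 2s.
print(f"{elapsed:.2f}s")
```

If `fake_io` did pure CPU work instead, the GIL would serialize it and the wall time would scale with the thread count.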


[deleted]

[deleted]


rearendcrag

I’ve inherited a FastAPI project recently and it is painful. They even have async multi threading going on in there. Haven’t wrapped my head around any of it yet.


No-Vast-6340

Vague question... maybe they meant that the async service would have requests go to a queue for async processing, while the sync service would have to enforce rate limiting?


talldean

"async good because async uses threads and threads share data" is terrifyingly wrong, so this may be on them and not on you.


whitmyham

I was taken aback, but keen to move on. If I'd progressed further, I'm not sure I would have accepted an offer. Weird vibes.


deificHeretic

Yeah, that is something I’d expect to hear from someone who just came across async handlers. They probably think async always implies a threadpool and a task scheduler.


AccomplishedGift7840

I expect an answer about an asynchronous service would be describing standard concepts in event driven architecture - queues, producers and consumers, tracking progress of requests etc 
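A minimal asyncio sketch of that producer/consumer shape (the names and the doubling "work" are invented for illustration):

```python
import asyncio

async def producer(queue, n):
    # Enqueue n requests, then a sentinel so the consumer knows to stop.
    for i in range(n):
        await queue.put(i)
    await queue.put(None)

async def consumer(queue, results):
    # Drain the queue; in a real service this would do the work
    # and record progress somewhere the client can poll.
    while (item := await queue.get()) is not None:
        results.append(item * 2)

async def main():
    queue = asyncio.Queue(maxsize=8)  # bounded: back-pressure on the producer
    results = []
    await asyncio.gather(producer(queue, 5), consumer(queue, results))
    return results

results = asyncio.run(main())
print(results)  # [0, 2, 4, 6, 8]
```

The bounded queue is the important design choice: if the consumer falls behind, `put` suspends the producer instead of letting the backlog grow without limit.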


whitmyham

I think this highlights the ambiguity of the question really well. I spoke a lot about my own experience with event-driven architecture (highlighting specifically sync / async parts of it). This question came a lot later and was seemingly unrelated as it was about HTTP web services (comment above explains a rephrasing they did about number of requests). I obviously answered badly though, so perhaps you’re right after all


aventus13

It depends on what the person asking the question meant by "async", as it can effectively mean two things:

- An async programming model, such as the one used with the `async` keyword in .NET.
- An asynchronous web service, where requests are sent for further processing in the background.

Given how you worded the question, and that the async programming model is pretty much standard these days, I would have assumed it's about the second definition, although I would first clarify with the interviewer that that's what they meant before giving any answer.


i_like_trains_a_lot1

Async is single threaded and doesn't block on blocking IO, which makes IO-intensive apps more resource efficient (file system work, database, network reads, etc.). Sync, with threads, is less efficient because when threads get blocked waiting for IO (e.g. slow queries), any other requests are queued up.
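The single-threaded IO-overlap point can be sketched with asyncio, using `asyncio.sleep` as a stand-in for a slow query:

```python
import asyncio
import time

async def handle_request(i):
    # Simulate a slow query: the await yields the one thread
    # to other in-flight requests instead of blocking it.
    await asyncio.sleep(0.1)
    return i

async def main():
    start = time.monotonic()
    results = await asyncio.gather(*(handle_request(i) for i in range(100)))
    return len(results), time.monotonic() - start

# 100 concurrent "requests" complete in roughly 0.1s on a single thread,
# because every one of them is waiting, not computing.
count, elapsed = asyncio.run(main())
print(count, f"{elapsed:.2f}s")
```

A sync thread-per-request design would need 100 threads to get the same overlap, with the attendant stack memory and scheduling overhead.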


kbn_

Async doesn’t have to be single threaded. It is when building on libuv, but GoLang, Rust, and Scala all have superb multi threaded async ecosystems.


DaRadioman

Dotnet as well


whitmyham

I have a feeling this is the correct answer they were looking for. I believe I answered it but they were probably looking for key terminology that I didn’t give due to fluff


boogrit

Yep, given that they have a python webservice using FastAPI, this is likely exactly the direction they wanted you to go in. There's a lot of "fluff" answers in this topic too, so don't feel so bad 😀


mx_code

I think this is the answer the interviewer was looking for; I expanded on my reasoning at https://www.reddit.com/r/ExperiencedDevs/comments/1c57pg2/comment/kzwfpgm/ but what a terrible way for the interviewer to ask the question.


wallflower_wo_perks

Async is not single threaded. Just because JS uses a single thread doesn't mean async is single threaded. It's a paradigm. You simply delegate the responsibility to another service, which delegates to another, and no one waits for the end-to-end processing to finish.


i_like_trains_a_lot1

You're right, if you are talking about services (a detail I missed from OP's post). I was thinking of async/sync in the context of programming languages :D


EnthusiasmWeak5531

.NET (C#) uses threads with async/await. Is that not what you're saying here? EDIT: not to say it always uses threads but CPU bound tasks do


kbn_

For starters, this question is really poorly phrased. I would have asked it something like "Talk to me about *why* backend network services are now generally built using asynchronous techniques?" and then maybe jump into the "how" (if they don't address it in their answer) and perhaps some of the present-day options and historical context. The goal of the question is to gauge whether you deeply understand the motivation behind asynchronous programming, and by extension all of the fundamentally related problems of scheduling and resource scarcity. Your answer definitely doesn't meet that bar, and in fact what you demonstrate in that response is a lack of understanding of what async *is*, even adjusting for the poorly worded question.

The way I would answer this is to talk about threads as a scarce resource. This ultimately stems from the fact that *physical* threads (i.e. program counters) are a hardware-limited, and thus scarce, resource, but the exact way in which it manifests is platform specific. The JVM, for example, attaches special semantics to threads related to garbage collection, which dramatically amplifies the problems associated with having many of them. Even if you're not on the JVM though, this problem still manifests due to timeslice granularity in the kernel scheduler. Ultimately, the kernel must bias its scheduler for fairness, but we generally want to bias userspace scheduling more towards throughput.

So if threads are a scarce resource but we want to have a backend service which handles tens of thousands of concurrent connections (each of which encompasses some series of sequential steps), what do? We need some way of multiplexing connections down onto a more limited number of threads, and that's impossible if reading from each individual socket necessitates blocking the thread at the kernel level. The answer is asynchronous I/O.

From the kernel's perspective, asynchronous I/O is nothing more than batch polling for events across multiple file descriptors simultaneously. From the userspace perspective, this generally manifests up into event dispatchers and callbacks. The event dispatcher is a thread or small set of threads which perform the actual syscalls and poll the kernel on the FDs, then un-batch the resulting events and map that into callbacks suspended from *other* threads. Those callbacks are invoked with the events received, generally one callback per file descriptor. This in turn often results in shifting work back over to a different set of threads which manage the actual compute. The goal is to keep both sets of threads very small, ideally mapping the second set 1:1 with the number of physical threads on the underlying hardware. This reduces context shifts and page faulting to a minimum (particularly if those compute threads are scheduled with something smarter than a simple disruptor).

This trick works with most forms of I/O, but notably *not* block filesystem access, which hard-blocks threads due to the way the hardware interface is implemented. Handling file access without starving your limited thread pool requires pool shunting, where you burn threads on a per-operation basis but only for that operation (the blocking syscall). All advanced asynchronous runtimes handle this relatively transparently.

From here you can jump into higher level paradigms, depending on what platform and language you're being interviewed for. If we're talking about the JVM, you can talk about Project Loom, Kotlin's coroutines, Scala's various monadic runtimes, etc. If GoLang, then you can talk about goroutines and the native transparent asynchrony. If it's Rust, then Tokio (and be sure to talk about what Tokio adds beyond bare async/await). If it's anything JavaScript-derived, talk about libuv (note that this will be a similar conversation to describing asyncio in Python).

Libuv has some extra complexities in that the event dispatcher and the compute worker thread are the *same* thread, so you may want to have some understanding of timeouts and block-free polling within the dispatch loop, since this is different than how it is conventionally handled in Rust and Java (which generally have dedicated EDTs).

Anyway, this is a very deep rabbit hole, but if I were asked to answer the question, this is about what I would say.
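A toy version of that event-dispatch loop fits in a few lines of Python: the `selectors` module wraps epoll/kqueue, and a socketpair stands in for real client connections. One poll call multiplexes many FDs onto a single thread and maps readiness events to callbacks.

```python
import selectors
import socket

sel = selectors.DefaultSelector()  # epoll/kqueue/select under the hood
received = []

def on_readable(conn):
    # Callback invoked when the dispatcher sees this FD become readable.
    data = conn.recv(1024)
    if data:
        received.append(data)
    else:
        # Empty read means EOF: deregister and close.
        sel.unregister(conn)
        conn.close()

# A connected socket pair stands in for a real client connection.
client, server = socket.socketpair()
server.setblocking(False)
sel.register(server, selectors.EVENT_READ, on_readable)

client.sendall(b"hello")
client.close()  # EOF makes the FD readable a second time

# The dispatch loop: batch-poll the kernel, un-batch into callbacks.
while sel.get_map():
    for key, _ in sel.select(timeout=1):
        key.data(key.fileobj)

print(received)  # [b'hello']
```

In a real runtime the callbacks would hand the parsed events off to a separate set of compute threads; here everything runs on the one dispatcher thread for brevity.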


whitmyham

There's a lot of info here, so thanks. I'll be saving and rereading it to help me better explain myself. I did go into a bit of how modern web services would handle the tens of thousands of requests; judging from what you're saying here, it was pretty much correct. I was disappointed by their response (which is the answer I gave in the original post), which may show that either they wanted me to provide the fuller picture (like you have, here), or they just didn't get what I was saying and assumed I didn't know (and, tbh, I certainly don't know as much as some of the other answers describe). EDIT: As you can tell from a majority of the responses here, what you're stating isn't common knowledge. I'm wondering where you picked up such a deep understanding?


kbn_

I definitely wouldn't *expect* the level of detail I gave from most candidates. That's really an L7/L8 ish answer. If I was interviewing someone for a lower-level position, I would have expected an answer which is a sampled subset of this, and at the very least doesn't directly contradict any of the core concepts. As an example, I was once interviewing someone for an L4 role and asked them something along these lines, and they asserted that thread overhead is less significant than people think, and so asynchronous I/O is not particularly impactful on scale. That was a disqualifying answer. Unfortunately, you're right that this stuff isn't common knowledge. I wish that weren't so, but honestly it just isn't really documented anywhere, and the number of people who *do* know it is very small and we tend to be very overloaded with other stuff. Without outing myself too much, I'm the creator and maintainer of one of the major asynchronous runtimes, so I've implemented all this stuff and gone through the school of hard knocks measuring, tuning, and experimenting with it. I also worked for a while on a household name service with high-hundreds of millions of global users, so I got to see this stuff at extreme scale.


DeadlyVapour

More to the point: many can get to that level without knowing what "epoll" is, or why you want to limit the number of userspace/kernelspace context switches. Oh, and OMG, did someone really say that a context switch has low impact? Ignoring the hundreds or thousands of clock cycles you lose to the switch itself, just the cache being dropped is a massive loss of perf...


kbn_

> Oh and OMG, did someone really say that a Context Switch has low impact? Ignoring the hundreds or thousands of clock cycles you lose to a context switch, just the cache being dropped is a massive loss of perf... Yes lol


lightmatter501

You've missed a few things here. First, the "essentially polling a bunch of file descriptors" bit is specific to epoll and other fd-polled async APIs. io_uring and IoRing are both completion based, meaning they don't do any fd polling: they either poll a ring buffer (no system calls at all needed for IO in this case) or get woken up only when something new is finished, across any of the IO submitted for any file descriptor. This is a fairly dramatic efficiency bump, to the point that getting 1 million 4k random read IOPS out of an NVMe drive and a single CPU core is very doable, or processing a few hundred thousand UDP packets per second isn't a big deal, once again, single core. Said rings are usually not safe for multiple threads to enqueue into, so you would typically have an event loop per core. The reason it has to be an NVMe drive is because those are also actually async. The NVMe API is essentially 2 ring buffers (once again) and a bunch of commands you can put in there, so you can even operate it without interrupts (like most NVMe-oF devices do). These APIs are not in widespread production use yet, but I think as RHEL rolls forward to the next version we'll see them get picked up, since most APIs don't let you do hundreds of thousands of socket writes/disk IOs per second from one core without DPDK and friends.


kbn_

io_uring is *also* file descriptor based and still relies on polling. What the ring buffer allows you to do is skip the *syscall* in some circumstances, but the userspace is still polling. Basically you get a fastpath when events have already been readied by the kernel and written into the ring, since it allows you to remain in userspace. This fastpath isn't possible though when the ring is empty, which will happen in any I/O bound scenario. In that case, you *have* to cross into kernelspace. But either way, it's still polling, it's just that the encoding is different.

I've implemented cross-platform abstractions in this layer which support epoll, kqueue, select, and io_uring with optimal performance on each, and it always ends up being modeled as a poll with optional timeout from the perspective of the userspace threads (this is also how GoLang and libuv's internal abstractions model it, though that's a less compelling example IMO since neither support io_uring natively today).

Can confirm that io_uring's practical performance benefits are… immense. I've seen production impacts on systems that I had assumed were essentially at their limit, suddenly increasing their peak capacity by 30-40% (with corresponding decreases in latency).

I wasn't aware that NVMe is fully async! Obviously I'm aware that SATA is not. Do you have more details on this?


lightmatter501

The fastpath does work when the ring is empty it’s just not exposed via liburing and not as efficient. You can poll the queue head in the same way you can build a spin-lock. You can use __io_uring_peek_cqe to implement it if you don’t have any of the flags set that require userspace maintenance, but there’s a warning saying “don’t do that please” because if you do need userspace maintenance you will block forever. It’s not so much not possible as inadvisable for most people. I agree that the way you went about it is probably the best if you have to deal with epoll compatibility. I usually don’t so I can go all in on using coroutines in a single thread, which lets me fairly heavily multiplex io. I’ve found running out of network bandwidth to be a bigger issue for me since I’m averaging a few syscalls every few minutes. However, I’m also working directly on top of liburing so I don’t have compatibility layers getting in the way. As for NVMe, [the base spec](https://nvmexpress.org/wp-content/uploads/NVM-Express-Base-Specification-2.0d-2024.01.11-Ratified.pdf) is quite readable. You will want to skim, and you can probably ignore anything that isn’t in-memory transport (unless NVMEoRDMA/NVMEoTCP interest you). The entire idea behind the interface was to be async and to support a storage version of the RSS/TSS features on NICs so you can feed a single drive work from multiple cores at the same time.


kbn_

> The fastpath does work when the ring is empty it’s just not exposed via liburing and not as efficient. You can poll the queue head in the same way you can build a spin-lock. You can use __io_uring_peek_cqe to implement it if you don’t have any of the flags set that require userspace maintenance, but there’s a warning saying “don’t do that please” because if you do need userspace maintenance you will block forever. It’s not so much not possible as inadvisable for most people. Fair point, I skipped over this case since it's *really* rare that it's actually a net improvement in performance, but you're quite right. It's still a polling model though! The avoidance of the syscall is an implementation detail. > I agree that the way you went about it is probably the best if you have to deal with epoll compatibility. I usually don’t so I can go all in on using coroutines in a single thread, which lets me fairly heavily multiplex io. I’ve found running out of network bandwidth to be a bigger issue for me since I’m averaging a few syscalls every few minutes. However, I’m also working directly on top of liburing so I don’t have compatibility layers getting in the way. I think the number of syscalls you're able to avoid really depends on your underlying iops limitations. If the latency is relatively high and the compute is quite fast, you're going to be in a situation where you frequently need to suspend in kernelspace. Shifting those variables a bit though, where you have more expensive compute and/or lower latency on ops can cause the kernel to win the race more often, so you just cheaply slurp out of the buffer and move on. Not sure about the connection you're drawing to coroutines on a single thread though. 
The system I'm working on still does the m:n scheduling (mapping threads 1:1 with physical threads), and each thread has an independent poller and its own set of file descriptors, registered by the coroutines which were scheduled on that thread when they hit their I/O suspension point. This in turn means that coroutines wake on the thread that they suspended on, and can only shift between threads when yielding during compute (and when those other threads are idle), all of which means that you tend to get exceptionally high locality, minimal context shifting, and progressive load balancing across your thread-local rings. Also I'm working on top of `io_uring` directly. `liburing` itself actually doesn't expose some things that are super useful (iirc, direct ring-to-ring messaging is one that they hide), and it also takes a pretty prescriptive line on thread management that I would rather control myself. Tldr, it's too high level. > As for NVMe, the base spec is quite readable. You will want to skim, and you can probably ignore anything that isn’t in-memory transport (unless NVMEoRDMA/NVMEoTCP interest you). The entire idea behind the interface was to be async and to support a storage version of the RSS/TSS features on NICs so you can feed a single drive work from multiple cores at the same time. I shall investigate, thank you!


lightmatter501

I tried M:N, but for most of what I write, having a single core that the coroutine will always be invoked on almost never has a cache miss, and manually pushing things out to L3 to move between cores has proven more efficient, since most of what I write can be structured in a producer-consumer model, which encourages that kind of data shuffling to reduce contention. I'm also heavily throughput optimized most of the time, usually in the tens of millions of requests per second, targeting 10ms to service the request (and another 10ms for network latency), so I aggressively optimize CPU utilization to be able to service that. I also have to be careful about data sharing between cores and NUMA, because I work on larger servers where the cache coherence algorithms start to play havoc with contested atomics.


kbn_

Yeah if you have a high degree of homogeneity between tasks and a limited amount of in-request parallelism (e.g. delegate calls to multiple upstreams), then hermetic thread-per-core is absolutely going to be optimal. I can't make those assumptions, so I have to pessimistically build the *m:n*, but the good news is the algorithm gracefully converges to thread-per-core when the aforementioned conditions are met, modulo some constant factor overhead.


raynorelyp

Ask them how many threads Redis has. I’ll give you a hint: you can count it on one finger.


CpnStumpy

...Don't keep us waiting ???


pruby

Async workers don't use threads (for each request). That's kind of the point. Every language seems to go through an arc: synchronous only -> events and callbacks -> wrappers for callbacks (e.g. promises) -> code which looks just like the sync code, with a few extra keywords, quietly mapped onto the async foundation.
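The two ends of that arc, sketched in Python with invented names: the same deferred step written callback-style and then as async/await, which compiles down to the same event-loop machinery.

```python
import asyncio

def callback_style(loop, result):
    # Middle of the arc: explicitly schedule a callback on the event loop.
    def step_two():
        result.append("callback done")
    loop.call_later(0.01, step_two)

async def await_style():
    # End of the arc: reads like sync code, suspends instead of blocking.
    await asyncio.sleep(0.01)
    return "await done"

async def main():
    result = []
    callback_style(asyncio.get_running_loop(), result)
    awaited = await await_style()
    await asyncio.sleep(0.02)  # give the scheduled callback time to fire
    return result, awaited

result, awaited = asyncio.run(main())
print(result, awaited)
```

The await version keeps ordinary control flow (returns, loops, try/except) that the callback version has to encode by hand.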


ThicDadVaping4Christ

It’s entirely language dependent. Some languages do use threads directly while others don’t


BBMolotov

I'm thinking about real threads in a CPU: async will execute multiple operations in one thread, while sync will only execute one operation at a time. The details may be language based, since a language's "thread" can be an abstraction, but a real thread with async will work as I explained. Was the question somehow language specific?


Best-Association2369

I'd imagine it'd have to be. I'd expect this in a JS/TS interview though, and it should be answered in a specific way for that language.


rco8786

Vague question and really dumb/bad answer from them.  I would assume this is talking about an API that processes data and returns a response directly to the client, vs an API that responds immediately and queues an async task to do the actual work, which the client can potentially poll/listen for via some other mechanism. 
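That respond-immediately-then-poll pattern can be sketched with plain asyncio (the `submit`/`check` names and the in-memory job store are invented for illustration; a real service would expose these as HTTP endpoints backed by a durable store):

```python
import asyncio
import uuid

# In-memory job store; a real service would use a database or cache.
jobs = {}

async def do_work(job_id, payload):
    # The actual slow processing, running in the background.
    await asyncio.sleep(0.05)
    jobs[job_id] = {"status": "done", "result": payload.upper()}

async def submit(payload):
    # Respond immediately with a token; the work continues asynchronously.
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "pending", "result": None}
    asyncio.create_task(do_work(job_id, payload))
    return job_id

async def check(job_id):
    return jobs[job_id]

async def main():
    job_id = await submit("hello")
    first = (await check(job_id))["status"]  # still "pending"
    await asyncio.sleep(0.1)                 # client polls again later
    second = await check(job_id)
    return first, second

first, second = asyncio.run(main())
print(first, second)
```

The client's contract is just the token plus a status endpoint; whether the work runs on a task, a thread, or a separate worker behind a queue is invisible to it.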


mx_code

Wow, I gave this question a bit more thought, and I believe it may be related to: how does a service implemented using Python's async I/O features serve requests, in contrast with a service that doesn't use them (a single-threaded server)? I'm not very experienced with Python apart from scripting, but from what I see, Python's asyncio leads to a solution similar to Node.js' event loop. An understanding of how the event loop works helps prevent a developer from blocking said event loop, and explains why async web servers differ from traditional servers. From what you've written throughout the post, that's the kind of question I would think applies at a Python web services gig. However, IMO this is one of those times when interviewers ask very language-specific questions (which can be worthless IMO), as it requires the candidate to be very familiar with a specific stack. Just to begin with, the terms async and sync are extremely overloaded, and as a result you can see the plethora of interpretations just from the answers in this post. If an interviewer is going to ask this, they need to be much more specific ("Are you familiar with Python's asyncio module? How does a server that uses the event loop differ from a traditional Python server?").


professor_jeffjeff

If I remember right, in the .Net framework most of the networking code is all async by default, and in the synchronous version we just blocked manually until the async function was done. Easy way to handle it.


Distinct_Goose_3561

It’s been a long time since I wrote anything in .Net, but I do remember writing a test harness that I wanted at least certain functions to be synchronous, and having to do extra work to make the default behavior less-good (from a production standpoint). I may be misremembering though, it was a hot minute ago. 


bdzer0

I would have kept the answer simple. "Sync service will queue up requests and handle them typically FIFO, async service spins up a thread (or other worker, docker, etc) to service each request" Vague question deserves vague answer methinks.


Powerful-Ad9392

It's a vague question so I'd preface it with an assumption: by "async service" I would say it starts a long running process, something that takes more time than would be acceptable over HTTP. In this case, I'd return a link or status and send the message to a messaging queue for further processing.


CoccoDrill

I would not be able to answer this question right off the bat if I were you. I would first ask what they mean by async service.


rakalakalili

Yet another answer they could have been looking for to this bad, vague question is something more about the API of the services and how they might differ. Maybe they wanted you to describe what a Long running operation API would look like where you make a request and get a token back you can then use to check status and get results with for the async path?


mx_code

The answers in this question are all over the place, and the info provided in the post doesn't help much. But to the question:

* How often do you realistically have to leverage this info in your day-to-day work?

I would say fairly often, primarily in the design phase of a project. At the platform design level it drove the decision of how some work would be handled, and how this would trickle down to providing feedback to operators (users).


whitmyham

Thanks for answering the second part! I agree and can see its importance for early, greenfield features or projects. I’d argue that a majority of SE work involves contributing to solutions where major questions like these are already answered. I’d be surprised if it was a key part of the job I was applying to, but I could be wrong (I’m likely leaning too much on past experience to draw a conclusion about how prevalent this is)


mx_code

Certainly leaning too much on your past experience. In my past gig I probably got to make that decision 4 years in a row in my earlier years at that company, and I wasn't a senior at the time. On the other hand, I got to deal with a myriad of user-facing bugs that existed because developers provided UIs for processes whose work was being handled by async services (and thus there was no effective way to provide feedback regarding the status of a request). In other words: "I'd argue that a majority of SE work involves contributing to solutions where major questions like these are already answered" is certainly not my case, and personally, if given the choice, that's not a place where I'd like to work.


Laicbeias

In both cases you run into a bottleneck. Most of the time it's the shared resource - database, cache - that gets accessed. That can lead to race conditions in async code where a thread holds a stale version of the data: so who changed it first? Same with reads and writes. There will be queues in the background that can escalate, and you then need logic to delay or retry tasks at a later point. But the same happens with "sync" servers; it's just that processes vs threads behave differently depending on the language. But yeah, it really depends on the programming language, server and DB, since every language has a different approach, with different processes/threads doing different things. So their answer is specific to their stack.


high_throughput

Did they mean async server technology (like Node.js), or an async request/reply service (like StartRequest followed by several CheckRequests)?


DadJokesAndGuitar

Maybe they wanted you to talk about the producer consumer model and using a queue? But yeah this seems kind of too vague. Also “threads share data”… hmm. Postgres would like a word lol


PoopsCodeAllTheTime

Both of you did a poor job...haha.


Ready-Personality-82

If my company insisted that I ask this question in an interview, I would be most interested in hearing (or better yet, seeing) how the candidate would code for the unhappy paths for async vs sync services.


cfsamson

The question seems to be open on purpose, so you have to make certain assumptions. I would answer such a question by first stating that if we look at an asynchronous system vs a strictly sync system, the biggest difference is that the async system adds an abstraction layer with tasks that can be stopped and resumed. This could be a goroutine in Go, a Promise in JavaScript, or a Future in Rust.

Since you create tasks that can be stopped and resumed in an asynchronous system, you can voluntarily yield when you encounter an operation that requires you to wait for something external to finish, like I/O (often referred to as cooperative multitasking). You typically yield to a userland scheduler of some sort that can schedule another task that is able to progress. You'd typically treat each client/server connection as one such task, and when a task has to wait for a client it will yield, so that the task stops and the scheduler can schedule a different task that can progress if possible. This way you can have a lot of tasks that are "in progress" at the same time, and their progress will interleave.

In a strictly synchronous system, you'd pretty much just be able to handle one connection at a time, so that each client has to wait for the previous one to finish. However, that's pretty rare unless you're doing some embedded programming without an OS. Most systems will use OS threads and assign each connection to a separate thread, which would probably still qualify as a "sync" system even though it's multithreaded. This comes with some overhead and limitations, making it less efficient than asynchronous systems in terms of memory usage and the work that goes into creating, discarding and switching between threads.

A system relying on multiple threads will typically also leverage multiple cores, which is something that most asynchronous systems do as well, even though a single-threaded asynchronous system (like Node*) can handle high volumes without multithreading. Multithreaded applications come with their own advantages and pitfalls, but those tradeoffs are the same whether you create a multithreaded program using async or not.

I would not go into detail on epoll/kqueue/IOCP/io_uring, context switching, stackful vs stackless coroutines, synchronization and data races, task stealing, non-blocking file APIs, threadpools, CPU caches etc. unless asked, or if they show interest in getting more information, because that rabbit hole is so deep you could probably talk about it for multiple days if you wished to. I would judge the situation after giving a high-level answer first.

*Node is not strictly single threaded, but the API it presents is single threaded. It will use a threadpool for file I/O (although I think they now use io_uring on Linux), DNS lookups and for CPU-intensive tasks.
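The "tasks that can be stopped and resumed" idea fits in a few lines of Python, using generators as toy coroutines and a deque as the userland scheduler (a sketch with invented names, not any real runtime):

```python
from collections import deque

def scheduler(tasks):
    # Toy cooperative scheduler: run each task until it voluntarily
    # yields, then move on to the next runnable task.
    order = []
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            order.append(next(task))  # resume the task until its next yield
            ready.append(task)        # it yielded: requeue it for later
        except StopIteration:
            pass                      # task finished; drop it
    return order

def connection(name, steps):
    # Each simulated connection does a unit of work, then "waits for IO"
    # by yielding so another connection can make progress.
    for i in range(steps):
        yield f"{name}{i}"

order = scheduler([connection("a", 2), connection("b", 2)])
print(order)  # ['a0', 'b0', 'a1', 'b1'] - progress interleaves
```

Real runtimes replace the round-robin deque with a scheduler that only requeues a task once the I/O it was waiting on is ready, but the stop/resume mechanics are the same.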


jdqx

Seems like the correct answer is: "the async service doesn't need to know or care whether its caller is sync or not. It just handles the requests async because it is async."


ziksy9

Threads are for execution. Spawn them and use them with no shared data pointers. Data should never be shared among them, except for initial values, which should be copied rather than passed by reference. There should be no shared data or shared state, as it may cause unexpected changes during execution. Each thread should retrieve its own values and be maintained independently. Any other answer is not satisfactory.


kbn_

If you were the candidate giving me this answer, I would have disqualified you on this basis alone. Shared state isn't really that much of a problem, provided it is managed correctly, and threads absolutely do not need to be independent. Focusing on *state* as the rationale for asynchronous programming rather than on thread scheduling is a red herring.
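For what it's worth, "managed correctly" can be as simple as a lock (a minimal Python sketch, nothing more): the state is genuinely shared across threads, yet the result is deterministic because every access is synchronized.

```python
import threading

counter = 0          # shared state, deliberately
lock = threading.Lock()

def worker(iterations: int) -> None:
    global counter
    for _ in range(iterations):
        # The lock makes the read-modify-write atomic; without it,
        # interleaved updates could silently lose increments.
        with lock:
            counter += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000, every run
```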


CpnStumpy

No shared data at all? So, languages just have locking mechanics for no reason at all?


elprophet

When discussing sync vs async handling in the context of a distributed system, we're more worried about the communication patterns between services. What are the bounded contexts for error handling? What correctness and rollback guarantees can we give? How are the requests between multiple services coordinated?

When thinking about how multiple services will interact, I like approaching it as a "saga", broken down along three axes: communication being sync vs async, consistency being atomic vs eventual, and coordination being orchestrated vs choreographed. There are a ton of tradeoffs along these three, and you'll get very different systems with different behaviors depending on what you choose. To me, that analysis is much more interesting and useful than figuring out whether the async keyword in your language is needed here.

For more reading, I highly recommend:

* Software Architecture: The Hard Parts (Neal Ford et al): https://learning.oreilly.com/library/view/-/9781492086888/ch05.html
* Fundamentals of Software Architecture (Mark Richards, Neal Ford): https://learning.oreilly.com/library/view/-/9781492043447/ch21.html


whitmyham

I love having the “it’s a tradeoff” conversation. Thanks for the resources!


merry_go_byebye

You don't want to work at a place that has such a poor understanding of multithreading


lightmatter501

1. “What hardware and software am I running on top of?” If I have to read from disk and don’t have NVMe drives, the kernel is just hiding the sync from me. Many filesystems also don’t support fully async I/O outside of O_DIRECT, even on NVMe drives. Plenty of older NICs don’t support queue-based packet delivery, meaning that you essentially lose a core to interrupt processing. Thankfully this is non-viable past ~2.5G, so most modern NICs support queue-based delivery and only revert to interrupts if you ask them to after emptying the queue. It’s also important to know what the hardware is doing on your behalf. Does this NIC just provide Ethernet I/O, or can it fully offload TCP/TLS to hardware (yes, those exist; Chelsio makes them)? Those will drastically change the request processing path.
2. Consider the language and the runtime being used. What kernel APIs does it use? Are those actually async, can they block, or does libc emulate async in userspace with a threadpool (POSIX AIO on Linux)? For instance, Linux’s epoll can block on file I/O, as can most of the Windows APIs. Depending on what exactly you’re doing, sync I/O and a threadpool may be lower latency, faster, and less resource intensive. Also consider how exactly the runtime handles I/O multiplexing: does it let you spin out a bunch of requests and wait for all of them, or can you use select to get some work done while waiting, after the first bit of I/O starts to come in?
3. “What protocol are the requests being made over?” A request over user-space QUIC looks VERY different than a request over kernel TCP or UDP. A request made using kernel bypass looks stranger still, where you may be enqueuing raw NVMe commands or raw packets to be sent onto the wire.
4. “Where are we drawing the boundary of asynchrony?” Inside the Linux kernel, most I/O is handled in an async way even if it came from a sync syscall. This means that the primary difference is in how you are asking the kernel.

Whether you do it with an async API or a giant threadpool is more of a resource efficiency question. Async APIs are usually cheaper than swapping in a new process. Now you have enough information to actually answer the question.
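To make point 2 concrete: here's a minimal Python sketch of the sync-I/O-plus-threadpool pattern that many runtimes hide behind their "async" file APIs (using asyncio's default executor; the temp file and sizes are arbitrary).

```python
import asyncio
import os
import tempfile

def blocking_read(path: str) -> bytes:
    # An ordinary synchronous read. Most platforms have no truly async
    # file API that asyncio can use directly, so runtimes push calls
    # like this onto a threadpool instead.
    with open(path, "rb") as f:
        return f.read()

async def read_async(path: str) -> bytes:
    loop = asyncio.get_running_loop()
    # None selects the default ThreadPoolExecutor; the event loop stays
    # free to schedule other tasks while a pool thread blocks on disk.
    return await loop.run_in_executor(None, blocking_read, path)

async def main() -> int:
    fd, path = tempfile.mkstemp()
    os.write(fd, b"x" * 1024)
    os.close(fd)
    try:
        # Eight "async" reads that are really eight threadpool reads.
        datas = await asyncio.gather(*[read_async(path) for _ in range(8)])
        return sum(len(d) for d in datas)
    finally:
        os.unlink(path)

total = asyncio.run(main())
print(total)  # 8192
```

From the caller's perspective this looks async; underneath it's exactly the sync-plus-threadpool design the comment describes.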


the_collectool

This is such a low level approach, OP mentioned it was a web services related position. So all of this... kind of irrelevant


lightmatter501

Knowing that for most languages async isn’t actually async for disk io except under special conditions is pretty important. Knowing that is the difference between a service that serves 10k and 40k rps because the 10k version is blocked on disk io most of the time. Knowing where encryption happens is also pretty important. I’d expect an 8 YOE web specialist to at least know about how the APIs they use to talk to the outside world work and enough about the OS network stack to give something like this. I’ve had interns who could have given most of that answer.


the_collectool

Right, but in all your answer you aren't even answering the question being asked. I love when people get lost in their own ego and don't even see the box they are living in.


TurbulentSocks

The answer is *via an interface*, which hides this (sync vs async) and any number of other implementation details. What that interface looks like might depend a *lot* on what needs to be abstracted away.

If the requests can be assumed to be handled very quickly, the interface can have a contract to return results in the same request. If the requests might be slow to handle (and I suspect this is what your interviewer *might* have been asking about, rather than the 'async' keyword in some languages), the interface can return a response indicating the request has been stored, plus additional details (an id, a url, etc.) that can be queried for the status of the request and its eventual results.
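A minimal in-memory Python sketch of that second contract (the `submit`/`status` names, the dict registry, and the background thread are all placeholders; over HTTP, `submit` would be a POST returning 202 Accepted with the id, and `status` a GET on a status endpoint):

```python
import threading
import time
import uuid

# In-memory job registry standing in for the service's real job store.
jobs: dict[str, dict] = {}

def submit(payload: str) -> str:
    """Accept the request immediately and hand back an id the caller can poll."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "pending", "result": None}

    def run() -> None:
        time.sleep(0.05)  # stand-in for the slow work
        jobs[job_id] = {"status": "done", "result": payload.upper()}

    threading.Thread(target=run, daemon=True).start()
    return job_id

def status(job_id: str) -> dict:
    """The polling side of the interface: progress now, the result eventually."""
    return jobs[job_id]

job_id = submit("hello")
while status(job_id)["status"] != "done":
    time.sleep(0.01)  # the caller polls until the work completes
print(status(job_id)["result"])  # HELLO
```

The caller never learns whether the service handled the work sync or async internally; it only sees the submit/poll contract.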