logoalt Hacker News

mrkeentoday at 3:47 PM4 repliesview on HN

> fork() is a relatively expensive system call; it must copy the entire process state (including memory) for the child process. Many optimizations have been made over the years, but a fork is still a fundamentally costly operation. To make things worse, a fork() call is often immediately followed by an exec(), which will discard all of that memory that was so carefully copied for the child.

It's weird to leave out a mention of copy-on-write - the optimisation that means that you don't copy over all the memory.


Replies

tux3today at 3:58 PM

This was left implicit in the article, but what they mean by copying the process state here is the memory management structures. That's mainly the page tables and the VMAs.

That means you have to allocate new pages to hold a copy of all these structures, even if the actual memory pointed by the pages is shared. And walking all those structures to make a copy is still costly.

cls59today at 4:24 PM

Even with copy-on-write, fork() still has to pay the setup cost for COW. If the parent process has a lot of busy threads (e.g. Java), you can end up doing a lot of unnecessary COW before exec() fires.

epcoatoday at 4:34 PM

> It's weird to leave out a mention of copy-on-write

For the intended audience of such a paper this is base knowledge.

FooBarWidgettoday at 3:57 PM

It says state. Copy on write still means it's O(number of page table entries) even if you don't copy the contents. It's a well known issue that forking a program with large virtual memory size is slow.

show 2 replies