Import from backup tarball - initial tup
gittup committed Feb 16, 2008
0 parents commit 2cd21d2
Showing 16 changed files with 275 additions and 0 deletions.
13 changes: 13 additions & 0 deletions 01-initial.dot
@@ -0,0 +1,13 @@
digraph g {
rankdir=LR;
maino [label="main.o"];
mainc [label="main.c"];
libo [label="lib.o"];
libc [label="lib.c"];
libh [label="lib.h"];
liba [label="lib.a"];
main -> {maino liba}
maino -> {mainc libh}
libo -> {libc libh}
liba -> libo;
};
14 changes: 14 additions & 0 deletions 02-location.dot
@@ -0,0 +1,14 @@
digraph g {
rankdir=LR;
node [shape=box];
maino [label="main.o"];
mainc [label="main.c"];
libo [label="lib.o"];
libc [label="lib.c"];
libh [label="lib.h"];
liba [label="lib.a"];
main -> {maino liba} [color="red" label="main/Makefile"];
maino -> {mainc libh} [color="blue" label="main/main.d"];
libo -> {libc libh} [color="green" label="lib/lib.d"];
liba -> libo [color="#FF33FF" label="lib/Makefile"];
};
26 changes: 26 additions & 0 deletions 03-large.dot
@@ -0,0 +1,26 @@
digraph g {
rankdir=LR;
node [shape=box];
maino [label="main.o"];
mainc [label="main.c"];
libo [label="lib.o"];
libc [label="lib.c"];
libh [label="lib.h"];
liba [label="lib.a"];
main -> {maino liba} [color="red" label="main/Makefile"];
maino -> {mainc libh} [color="blue" label="main/main.d"];
libo -> {libc libh} [color="green" label="lib/lib.d"];
liba -> libo [color="#FF33FF" label="lib/Makefile"];

all -> main [label="main/Makefile" color="red"];
all -> exe1 [label="exe1/Makefile" color="#BBBB33"];
all -> exe2 [label="exe2/Makefile"];
all -> dotdot [label=".../Makefile"];
all -> exeN [label="exeN/Makefile"];
exe1o [label="exe1.o"];
exe1c [label="exe1.c"];
dotdot [label="..."];
barh [label="bar.h"];
exe1 -> {exe1o liba} [color="#BBBB33" label="exe1/Makefile"];
exe1o -> {exe1c barh libh} [color="#BB8866" label="exe1/exe1.d"];
};
3 changes: 3 additions & 0 deletions Makefile
@@ -0,0 +1,3 @@
all: $(patsubst %.dot,%.png,$(wildcard *.dot))
%.png: %.dot Makefile
	dot -Tpng $< > $@
16 changes: 16 additions & 0 deletions README
@@ -0,0 +1,16 @@
tup - teh updater

tup is always run as 'tup'. Not 'tup FOO=bar' or 'tup -d -f7 -seven=1one1!'. This is because tup does not take arguments - tup settles them:

Dude1: I think Makefiles are the best!
Dude2: <Please wait while loading JVM>
Dude2: <...>
Dude2: <...>
Dude2: <...>
Dude2: <...>
Dude2: Ant is awesome because it's java! Java ftw!
tup: I am the best. This argument is settled.
Dudes1&2: Agreed. Let's be friends.


tup runs in a clean environment. If your environment is not clean when tup runs, it will clean it for you. Mother Nature is totally in love with tup because of this.
15 changes: 15 additions & 0 deletions cache.txt
@@ -0,0 +1,15 @@
Cold cache script:
#!/bin/sh
make distclean
echo 1 > /proc/sys/vm/drop_caches
echo 2 > /proc/sys/vm/drop_caches
echo 3 > /proc/sys/vm/drop_caches
make allnoconfig
time make

Warm cache script:
#!/bin/sh
make distclean
make allnoconfig
rgrep laksdflkdsaflkadsfja .
time make
2 changes: 2 additions & 0 deletions doing.txt
@@ -0,0 +1,2 @@
emerge -uDpv --newuse world
try to use truetype font in dot
Binary file added fase06-final.pdf
Binary file not shown.
58 changes: 58 additions & 0 deletions inotify.c
@@ -0,0 +1,58 @@
#if 0
Step 1. Start at initial directory foo. Add watch.

Step 2. Setup handlers for watch created in Step 1.
Specifically, ensure that a directory created
in foo will result in a handled CREATE_SUBDIR
event.

Step 3. Read the contents of foo.

Step 4. For each subdirectory of foo read in step 3, repeat
step 1.

Step 5. For any CREATE_SUBDIR event on bar, if a watch is
not yet created on bar, repeat step 1 on bar.
#endif
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/inotify.h>

int main(void)
{
	int fd, wd;
	int rc = 0;
	ssize_t x;
	uint32_t mask;
	char buf[1024];

	fd = inotify_init();
	if(fd < 0) {
		perror("inotify_init");
		return -1;
	}

	/* Watch a single directory for modified, created and deleted entries. */
	mask = IN_MODIFY | IN_CREATE | IN_DELETE;
	wd = inotify_add_watch(fd, "/home/mjs/btmp", mask);
	if(wd < 0) {
		perror("inotify_add_watch");
		rc = -1;
		goto close_inot;
	}
	while((x = read(fd, buf, sizeof(buf))) > 0) {
		ssize_t offset = 0;

		/* A single read() may return several variable-length events. */
		while(offset < x) {
			struct inotify_event *e = (struct inotify_event *)(buf + offset);
			printf("Received event: %x\n", e->mask);
			if(e->len > 0) {
				printf(" Name: %s\n", e->name);
			}
			offset += sizeof(*e) + e->len;
		}
	}

close_inot:
	close(fd);
	return rc;
}
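Steps 1 to 5 in the comment at the top of inotify.c describe a recursive watch, while the program above watches only one directory. The following is a minimal sketch of those steps in Python. It makes no inotify calls: add_watch is a hypothetical callback standing in for inotify_add_watch() plus the handler setup from Step 2, and watch_tree and on_create_subdir are names invented for this example.

```python
import os
import tempfile


def watch_tree(root, add_watch, watched=None):
    """Steps 1, 3 and 4: watch root, read its contents, recurse into subdirs."""
    if watched is None:
        watched = set()
    if root in watched:
        return watched          # guard from Step 5: never watch a directory twice
    add_watch(root)             # Step 1 (Step 2's handlers are assumed to be set up here)
    watched.add(root)
    with os.scandir(root) as entries:                       # Step 3
        subdirs = [e.path for e in entries if e.is_dir(follow_symlinks=False)]
    for sub in subdirs:                                     # Step 4
        watch_tree(sub, add_watch, watched)
    return watched


def on_create_subdir(path, add_watch, watched):
    """Step 5: a CREATE_SUBDIR event on an unwatched directory repeats Step 1."""
    return watch_tree(path, add_watch, watched)


# Usage: record which directories would receive a watch.
with tempfile.TemporaryDirectory() as foo:
    os.makedirs(os.path.join(foo, "a", "b"))
    calls = []
    watched = watch_tree(foo, calls.append)
    os.mkdir(os.path.join(foo, "c"))            # simulate a new subdirectory
    on_create_subdir(os.path.join(foo, "c"), calls.append, watched)
    rel = sorted(os.path.relpath(p, foo) for p in calls)
print(rel)
```

The watch is added before the directory is read, as in the steps above, so a subdirectory created during the scan should be picked up either by the scan or by an event. That is presumably why Step 5 checks whether a watch already exists.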
1 change: 1 addition & 0 deletions java.txt
@@ -0,0 +1 @@
http://freshmeat.net/projects/jvmd/
Binary file added mmath-thesis.pdf
Binary file not shown.
17 changes: 17 additions & 0 deletions notes.txt
@@ -0,0 +1,17 @@
Files are "dependency objects" - timestamp comparison?

for config files (eg: linux/.config)
- dependency object compares against version copied during last build
eg: diff linux/.config build/.config
each line difference creates a new "dependency object", which may be a dependency for source files
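
A rough sketch of the config comparison idea, in Python. config_changes is a hypothetical name, and the parsing only assumes the usual kernel .config line forms (CONFIG_FOO=value and "# CONFIG_FOO is not set"). Each name it returns would become one "dependency object".

```python
def config_changes(old_text, new_text):
    """Return the CONFIG_ names whose setting differs between two .config files."""
    def parse(text):
        settings = {}
        for line in text.splitlines():
            line = line.strip()
            if line.startswith("# CONFIG_") and line.endswith(" is not set"):
                # An unset option is stored as None, the same as a missing one.
                settings[line[2:-len(" is not set")]] = None
            elif line.startswith("CONFIG_") and "=" in line:
                name, _, value = line.partition("=")
                settings[name] = value
        return settings

    old, new = parse(old_text), parse(new_text)
    return {name for name in old.keys() | new.keys()
            if old.get(name) != new.get(name)}


# Usage: build/.config is the copy from the last build, linux/.config the current one.
build_config = "CONFIG_PCI=y\nCONFIG_USB=y\n"
linux_config = "# CONFIG_PCI is not set\nCONFIG_USB=y\nCONFIG_NET=y\n"
changed = config_changes(build_config, linux_config)
print(sorted(changed))
```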

lspci at startup
- can check against old version of lspci output
- create new dependency objects - even update .config

when parsing directory structure, allow .config objects to be applied recursively

eg: obj-$(CONFIG_PCI) += pci
pci/dep.tup -- Kernel.CONFIG_PCI

debug mode / release mode
7 changes: 7 additions & 0 deletions old.html
@@ -0,0 +1,7 @@

<p>Also note that there is a Makefile in the top-level directory. This file would typically have a target to call <em>make</em> in each of its subdirectories. The subdirectory Makefile could then build a piece of the project, and/or invoke <em>make</em> on its subdirectories. Because <em>make</em> is recursively invoking itself, this structure is known as "Recursive Make". This is generally considered a bad idea, because it leads to incorrect builds and maintenance hassles, and doesn't allow project-level build parallelism. For a more detailed analysis of why recursive make is not a valid solution, see <a href="http://miller.emu.id.au/pmiller/books/rmch/">Recursive Make Considered Harmful</a>.</p>

<p>The way to work around the inefficiencies and incorrect builds of recursive make is to implement a non-recursive make. A non-recursive make uses a single <em>make</em> process that has global dependency knowledge. Sub-projects can still specify their own dependencies (avoiding a single huge Makefile at the top level) by using GNU make's <em>include</em> directive. The single <em>make</em> process in the non-recursive make builds a DAG for the entire project and can use that knowledge to execute a parallel build across the whole project. It can also enforce dependencies across sibling directories that may have gotten lost or been incorrect in the recursive make setup.</p>
<p>Unfortunately, the non-recursive make, much like the recursive make, does not scale well. Consider a larger project, which now contains several other executables in addition to the main/lib combination from before:</p>

<p>After each change, <em>make</em> is run from the top level. This Makefile includes each sub-makefile in turn, including those in the irrelevant <em>exeN</em> directories (which, while not shown here, could be large projects in and of themselves). Each time, <em>make</em> reads in every Makefile and every .d file in the entire project before narrowing down to the changes in main/ and lib/. A developer interested in saving time may instead try to execute <em>make</em> once in lib/ and once in main/, essentially keeping track of the <em>main-&gt;lib.a</em> dependency in his head. However, in this case he would be neglecting the fact that exe1 now also uses the library, so it needs to be rebuilt to make sure it hasn't been broken. Essentially the problem is that <em>make</em> needs to run at the top level to ensure a consistent build, but for a large-scale project it is too slow to read in all the dependency information, and then <em>stat()</em> every file to determine which ones are out of date.</p>
52 changes: 52 additions & 0 deletions origin.txt
@@ -0,0 +1,52 @@
# Screen dump from when I was typing around. I thought of making a distro
# called "Teh Uber Linux", then tup followed
INSTALL sound/core/snd-pcm.ko
INSTALL sound/core/snd-rawmidi.ko
INSTALL sound/core/snd-rtctimer.ko
INSTALL sound/core/snd-timer.ko
INSTALL sound/core/snd.ko
INSTALL sound/pci/ac97/snd-ac97-codec.ko
INSTALL sound/pci/ca0106/snd-ca0106.ko
INSTALL sound/pci/emu10k1/snd-emu10k1-synth.ko
INSTALL sound/pci/emu10k1/snd-emu10k1.ko
INSTALL sound/pci/emu10k1/snd-emu10k1x.ko
INSTALL sound/pci/hda/snd-hda-intel.ko
INSTALL sound/synth/emux/snd-emux-synth.ko
INSTALL sound/synth/snd-util-mem.ko
if [ -r System.map -a -x /sbin/depmod ]; then /sbin/depmod -ae -F System.map 2.6.22-gentoo-r9; fi
[root@captainfalcon linux-2.6.22-gentoo-r9]# ls
COPYING Makefile arch include mm sound
CREDITS Module.symvers block init net usr
Documentation README crypto ipc patches.txt vmlinux
Kbuild REPORTING-BUGS drivers kernel scripts
MAINTAINERS System.map fs lib security
[root@captainfalcon linux-2.6.22-gentoo-r9]# tul
bash: tul: command not found
[root@captainfalcon linux-2.6.22-gentoo-r9]# tul
bash: tul: command not found
[root@captainfalcon linux-2.6.22-gentoo-r9]# tup
bash: tup: command not found
[root@captainfalcon linux-2.6.22-gentoo-r9]# tup
bash: tup: command not found
[root@captainfalcon linux-2.6.22-gentoo-r9]# tup
bash: tup: command not found
[root@captainfalcon linux-2.6.22-gentoo-r9]# tup
bash: tup: command not found
[root@captainfalcon linux-2.6.22-gentoo-r9]# tutp
bash: tutp: command not found
[root@captainfalcon linux-2.6.22-gentoo-r9]# tup
bash: tup: command not found
[root@captainfalcon linux-2.6.22-gentoo-r9]# tup
bash: tup: command not found
[root@captainfalcon linux-2.6.22-gentoo-r9]# tup
bash: tup: command not found
[root@captainfalcon linux-2.6.22-gentoo-r9]# tup
bash: tup: command not found
[root@captainfalcon linux-2.6.22-gentoo-r9]# tup
bash: tup: command not found
[root@captainfalcon linux-2.6.22-gentoo-r9]# tup
bash: tup: command not found
[root@captainfalcon linux-2.6.22-gentoo-r9]# tup
bash: tup: command not found
[root@captainfalcon linux-2.6.22-gentoo-r9]#

51 changes: 51 additions & 0 deletions proposal.html
@@ -0,0 +1,51 @@
<html>
<head>
<title>Mike Shal - CS798 Proposal</title>
</head>
<body>
<ol>
<li><h2>Introduction</h2></li>
<p>This proposal describes the project I would like to undertake towards credit for a Master of Science in Computer Science. The main goal of the project will be to design and implement a build system (such as the UNIX <em>make</em> program) that scales to large projects. The sample projects and analysis will assume the projects are written in C and compiled with gcc, though it is hoped that the result will not be language-dependent. In the following sections I will give some background information about the history and current state of build systems, and explain why I believe they are insufficient for large-scale development. Then I will describe the separate components of the proposed build system. Finally, I will discuss the long-term goals of the program if it is successful.</p>

<li><h2>Background</h2></li>
<p>The need for a build tool becomes evident during the development of any project that grows to span multiple source files, libraries, and directories. The developer typically focuses on a small subset of the project, and even with just a few source files it is more efficient to re-compile only the changes than it is to re-compile the whole project. Parts of the project are dependent on other parts, however. For example, multiple source files can include the same header. A change to the header requires that each source file that includes it is re-built. Similarly, if an archive is re-built, then all programs that link to the archive must be re-linked. All of these dependencies must be tracked by the build program to ensure a consistent and correct build.</p>
<p>The common UNIX program <em>make</em> is a build tool that allows a developer to specify dependencies, and commands to rebuild targets. When executed, <em>make</em> will perform two main functions: 1) construct a DAG (directed acyclic graph) of all dependencies, and 2) traverse the graph starting at the requested target, rebuilding only those targets that are out of date with respect to their dependencies. In this way <em>make</em> can be used to build an entire project from scratch, or build only the portions of the project that have been affected by changes.</p>
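<p>As a rough illustration of these two steps, here is a minimal Python sketch. It is not <em>make</em>'s actual implementation: the <code>make</code> function, the integer timestamps, and the counter that stands in for running rebuild commands are all assumptions made for the example. The edges are those of the small main/lib example used later in this proposal.</p>

```python
def make(goal, deps, mtime):
    """Rebuild `goal` and anything it depends on that is out of date.

    deps  maps each target to its list of dependencies (the DAG edges).
    mtime maps each existing file to an integer timestamp, a hypothetical
          stand-in for stat(); it is updated in place.
    Returns the targets that were rebuilt, in build order.
    """
    rebuilt, visited = [], set()
    clock = max(mtime.values(), default=0)

    def visit(target):
        nonlocal clock
        if target in visited:
            return
        visited.add(target)
        for dep in deps.get(target, ()):
            visit(dep)                  # bring dependencies up to date first
        if target not in deps:
            return                      # a source file: nothing to rebuild
        newest = max((mtime[d] for d in deps[target]), default=0)
        if target not in mtime or mtime[target] < newest:
            clock += 1                  # stand-in for running the rebuild command
            mtime[target] = clock
            rebuilt.append(target)

    visit(goal)
    return rebuilt


deps = {
    "main": ["main.o", "lib.a"],
    "main.o": ["main.c", "lib.h"],
    "lib.o": ["lib.c", "lib.h"],
    "lib.a": ["lib.o"],
}
mtime = {"main.c": 1, "lib.c": 1, "lib.h": 1,
         "main.o": 2, "lib.o": 2, "lib.a": 3, "main": 4}

up_to_date = make("main", deps, mtime)  # nothing has changed yet
mtime["lib.h"] = 5                      # simulate editing lib.h
after_edit = make("main", deps, mtime)
print(up_to_date, after_edit)
```

<p>Note that even the first call, which rebuilds nothing, still visits every node reachable from the requested target.</p>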
<p>Unfortunately, <em>make</em> does not scale well for large projects when building only the parts of the project that are affected by changes. Specifically, the update operation is at best an O(n) algorithm, where <em>n</em> is the number of dependencies. This is undesirable, because unrelated pieces of the project affect the build time of the part of the project we may be working on. This behavior is not specific to <em>make</em>. There are several complaints (some legitimate, some not) about <em>make</em> or its Makefiles: Makefiles are their own sort of language (as opposed to something already known like Perl or Python), automatic dependency handling is not built in to the program, and it is difficult to maintain a project across multiple directories. As a result, a number of alternatives to <em>make</em> have been created, including <a href="http://www.dsmit.com/cons/">CONS</a>, <a href="http://www.scons.org/">SCONS</a>, <a href="http://www.perforce.com/jam/jam.html">JAM</a>, <a href="http://www.a-a-p.org/">A-A-P</a>, and <a href="http://omake.metaprl.org/index.html">OMake</a>, among others. All of these programs suffer from the same linear update time. Ideally, the time to process an update would be proportional to the number of changes required. The current linear behavior is actually a result of three separate factors:</p>
<ol>
<li><em>make</em> always reads the entire DAG before rebuilding anything.</li>
<li><em>make</em> does not know beforehand which files have been updated; instead, it considers each target and checks to see if it is out of date with respect to its dependencies.</li>
<li>The storage of the dependencies in the filesystem (as generated by gcc's <em>-MD</em> option, or older equivalents such as <em>makedepend</em>) forces any program to read every dependency file before updating.</li>
</ol>

<p>We will now consider each of these factors in more detail. First, <em>make</em> always reads in the entire DAG before updating anything. Since each edge must be added to the DAG, it is easy to see that even if we ignore any other processing, constructing the DAG is at best a linear operation. As such, any build program that could hope to improve on a linear time algorithm must not rely on the entire DAG. Instead, it should construct only the portion of the graph that it needs based on the changes to the system. This is difficult because of the second and third factors mentioned above.</p>
<p>When performing an update, the <em>make</em> program essentially starts with a single target (such as 'all'), and asks the question <i>Do I need to update this target?</i> This question can only be answered by checking the timestamps of each of its dependencies, and each of their dependencies, and so on through the DAG. (Other build programs may use MD5 sums or other hashes instead of timestamps, but this is irrelevant to the complexity of the algorithm.) Again, this is at best a linear operation. While developing in a project, however, we don't really care if 'all' is updated or not. What we care about is that anything dependent on the immediate changes we have made (such as a .c file we modified) is updated. A better question to ask is: <i>What files need to be updated given that these files have changed?</i> Answering this question assumes we have a list of the changed files up front. This is not currently provided to the build program.</p>
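<p>To make the second question concrete, here is a minimal Python sketch of answering it by walking outward from the changed files. It assumes something no current tool provides: a ready-made <code>dependents</code> map from each file to the targets that depend directly on it. The <code>affected</code> function is a hypothetical name, and the edges are taken from the larger example graph in this proposal.</p>

```python
from collections import deque


def affected(changed, dependents):
    """Return the set of targets reachable from the changed files."""
    seen = set(changed)
    queue = deque(changed)
    result = set()
    while queue:
        node = queue.popleft()
        for target in dependents.get(node, ()):
            if target not in seen:      # visit each target once
                seen.add(target)
                result.add(target)
                queue.append(target)
    return result


# Reverse edges: file -> targets that depend directly on it.
dependents = {
    "main.c": ["main.o"],
    "lib.c": ["lib.o"],
    "lib.h": ["main.o", "lib.o", "exe1.o"],
    "exe1.c": ["exe1.o"],
    "bar.h": ["exe1.o"],
    "main.o": ["main"],
    "lib.o": ["lib.a"],
    "lib.a": ["main", "exe1"],
    "exe1.o": ["exe1"],
    "main": ["all"],
    "exe1": ["all"],
}

from_main_c = affected(["main.c"], dependents)
from_lib_h = affected(["lib.h"], dependents)
print(sorted(from_main_c), sorted(from_lib_h))
```

<p>The walk only touches nodes downstream of the change, so its cost depends on the size of the affected part of the graph rather than on the whole project. The result is an unordered set; a real tool would still have to rebuild the targets in dependency order.</p>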
<p>Even if we had a list of files that were changed up front, any build tool is again limited to a linear-time algorithm because of the way the dependencies are structured in the filesystem. Consider the following minimal example:</p>

<pre>
Makefile
main/
    main.c
    Makefile
lib/
    lib.c
    lib.h
    Makefile
</pre>
<p>Such a program may have the following dependency information:</p>
<table border=1><tr><td><img src="01-initial.png"></td></tr></table>

<p>The dependencies are actually stored in several different places. The two dependencies on the header file are output by gcc (using the <em>-MD</em> or similar option) the first time the program is built. This information can be used on subsequent builds to re-build both main.o and lib.o if the header changes. The edges from the .o to the .c files are generally written in the Makefiles as implicit rules (such as <em>%.o: %.c</em>). These dependencies are also written by gcc in the .d file. The following graph shows the same program along with where the actual edges are found:</p>
<table border=1><tr><td><img src="02-location.png"></td></tr></table>

<p>Dependencies that come from the same file are shown in the same color. This will be explained later. For now, suppose we are developing the interaction between the main program and the library. This will likely include changing the lib.h file. When this file is changed, both lib.o and main.o must be rebuilt, the archive must be re-created, and ultimately the main executable must be linked. Now let's assume that this is actually a small part of a much larger project:</p>

<table border=1><tr><td><img src="03-large.png"></td></tr></table>

<p>Suppose we are still only developing the interaction between the main program and the library. However, now any changes we make to the library must also cause a rebuild of one or more other binaries. In order to determine what pieces must be rebuilt, the build tool must read in all of the dependencies and find the edges incident on lib.h. Notice how the edges incident on the lib.h node are all different colors. This indicates they are all stored in separate files. So if we try to answer the basic question <i>What files must be updated given that lib.h has changed?</i>, the program must necessarily read in every dependency file, since we have no way of knowing which ones might contain an edge to lib.h. This means that if we specify dependencies in this manner, no matter what program we use, we will be forced to use at best a linear update algorithm. The consequence is that any build program that relies on the output of gcc's dependency mechanism can perform no better than a linear-time update.</p>
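<p>The scan described above can be sketched as follows. This is an illustration under simplifying assumptions: the dependency files are in-memory strings with one simplified <em>target: prerequisites</em> rule per line (real gcc output has full paths and line continuations), <code>parse_rules</code> and <code>dependents_of</code> are hypothetical names, and the <em>exe2/exe2.d</em> entry is invented to stand for an unrelated directory.</p>

```python
def parse_rules(text):
    """Parse simple 'target: dep dep ...' lines (no continuations or variables)."""
    rules = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        target, _, prereqs = line.partition(":")
        rules[target.strip()] = prereqs.split()
    return rules


def dependents_of(header, dep_files):
    """Find rules that mention `header`, counting how many files were read."""
    files_read = 0
    hits = []
    for path, text in dep_files.items():
        files_read += 1     # any file might mention the header, so read them all
        for target, prereqs in parse_rules(text).items():
            if header in prereqs:
                hits.append((path, target))
    return hits, files_read


dep_files = {
    "main/main.d": "main.o: main.c lib.h\n",
    "lib/lib.d": "lib.o: lib.c lib.h\n",
    "exe1/exe1.d": "exe1.o: exe1.c bar.h lib.h\n",
    "exe2/exe2.d": "exe2.o: exe2.c\n",    # hypothetical unrelated directory
}

hits, files_read = dependents_of("lib.h", dep_files)
print(hits, files_read)
```

<p>Three files mention lib.h, but all four had to be read to find that out, and the count grows with every directory added to the project.</p>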


<li><h2>Goals</h2></li>

<li><h2>Long-term Goals</h2></li>
</ol>
</body>
</html>
Binary file added thesis-approval.pdf
Binary file not shown.
