Google Interview Questions
Some time back I interviewed with Google and a number of other well-known software development companies. I've written up a number of questions very similar to the ones I was asked, changing them enough that I'm not breaking my NDA.

Q: Why are manhole covers round?
Q: A man pushed his car to a hotel and lost his fortune. What happened?
Q: Explain the significance of "dead beef".
Q: Write a C program which measures the speed of a context switch on a UNIX/Linux system.
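For what it's worth, one common trick is to bounce a byte between two processes over a pair of pipes: every round trip forces at least two context switches, so dividing the elapsed time by 2N gives a rough upper bound (pipe overhead included). A sketch for Linux:

    /* Sketch: estimate context-switch cost by ping-ponging a byte
     * between parent and child over two pipes. Each round trip forces
     * at least two context switches, so cost ~= elapsed / (2 * N).
     * Pipe overhead is included, so treat this as an upper bound. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <time.h>

    #define N 100000

    int main(void) {
        int p2c[2], c2p[2];            /* parent->child, child->parent */
        char buf = 'x';
        if (pipe(p2c) == -1 || pipe(c2p) == -1) { perror("pipe"); exit(1); }

        pid_t pid = fork();
        if (pid == 0) {                /* child: echo every byte back */
            for (int i = 0; i < N; i++) {
                read(p2c[0], &buf, 1);
                write(c2p[1], &buf, 1);
            }
            _exit(0);
        }

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < N; i++) {  /* parent: send, wait for echo */
            write(p2c[1], &buf, 1);
            read(c2p[0], &buf, 1);
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
        printf("~%.0f ns per context switch\n", ns / (2.0 * N));
        return 0;
    }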
Q: Given a function which produces a random integer in the range 1 to 5, write a function which produces a random integer in the range 1 to 7.
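One answer is rejection sampling; a sketch, assuming rand5() is provided and uniform:

    /* Two calls to rand5() give a uniform value in 1..25; keep only
     * 1..21 (21 = 3 * 7) so the result is uniform on 1..7. */
    int rand5(void); /* assumed provided */

    int rand7(void) {
        for (;;) {
            int v = 5 * (rand5() - 1) + rand5(); /* uniform on 1..25 */
            if (v <= 21)
                return (v - 1) % 7 + 1;          /* uniform on 1..7 */
        }
    }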
Q: Describe the algorithm for a depth-first graph traversal.
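A minimal recursive sketch over adjacency lists (the names and layout are just one reasonable choice):

    /* Depth-first traversal of a graph stored as adjacency lists.
     * Nodes are numbered from 0; adj[u] is an array of deg[u]
     * neighbors; visited[] starts zeroed. */
    void dfs(int u, int **adj, int *deg, int *visited) {
        visited[u] = 1;
        /* visit(u) would go here */
        for (int i = 0; i < deg[u]; i++) {
            int v = adj[u][i];
            if (!visited[v])
                dfs(v, adj, deg, visited);
        }
    }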
Q: Design a class library for writing card games.
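In C the "class library" reduces to structs plus functions; a toy sketch with illustrative names (in C++ these would naturally be classes):

    #include <stdlib.h>

    enum suit { CLUBS, DIAMONDS, HEARTS, SPADES };
    struct card { enum suit suit; int rank; };      /* rank 1..13 */
    struct deck { struct card cards[52]; int top; };

    void deck_init(struct deck *d) {
        for (int s = 0; s < 4; s++)
            for (int r = 1; r <= 13; r++)
                d->cards[s * 13 + r - 1] = (struct card){ (enum suit)s, r };
        d->top = 0;
    }

    void deck_shuffle(struct deck *d) {             /* Fisher-Yates */
        for (int i = 51; i > 0; i--) {
            int j = rand() % (i + 1);
            struct card t = d->cards[i];
            d->cards[i] = d->cards[j];
            d->cards[j] = t;
        }
    }

    struct card deck_deal(struct deck *d) {
        return d->cards[d->top++];
    }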
Q: You need to check that your friend, Bob, has your correct phone number, but you cannot ask him directly. You must write the question on a card and give it to Eve, who will take the card to Bob and return the answer to you. What must you write on the card, besides the question, to ensure Bob can encode the message so that Eve cannot read your phone number?
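One classical answer is a one-time pad: put a random key on the card, and have Bob add his copy of the number to the key digit by digit, modulo 10. Eve sees only the sum, which looks uniformly random without the key; you run the same encoding on your own number and compare the results. A sketch:

    /* One-time-pad idea: the card carries a random key; Bob returns
     * (digit + key digit) mod 10, which Eve cannot invert without the
     * key. digits and key are equal-length strings of '0'..'9'. */
    #include <string.h>

    void encode(const char *digits, const char *key, char *out) {
        size_t n = strlen(digits);
        for (size_t i = 0; i < n; i++)
            out[i] = (digits[i] - '0' + key[i] - '0') % 10 + '0';
        out[n] = '\0';
    }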
Q: How are cookies passed in the HTTP protocol?
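Roughly: the server sets them with a Set-Cookie response header, and the browser sends them back on subsequent requests to the same site in a Cookie header. An illustrative exchange (values made up):

    HTTP/1.1 200 OK
    Set-Cookie: session=abc123; Path=/; HttpOnly

    GET /page HTTP/1.1
    Host: example.com
    Cookie: session=abc123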
Q: What is the size of the C structure below on a 32-bit system? On a 64-bit system?
struct foo { char a; char* b; };
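The usual answer hinges on alignment padding: the char is padded so the pointer lands on its natural boundary, giving 1 + 3 + 4 = 8 bytes on a typical 32-bit ABI and 1 + 7 + 8 = 16 on a typical 64-bit ABI (the exact numbers depend on the compiler). You can check with sizeof:

    #include <stdio.h>

    /* char a is followed by padding so char* b is aligned:
     * 1 + 3 + 4 = 8 on a typical 32-bit system,
     * 1 + 7 + 8 = 16 on a typical 64-bit system. */
    struct foo { char a; char *b; };

    int main(void) {
        printf("sizeof(struct foo) = %zu\n", sizeof(struct foo));
        return 0;
    }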
Q: Design the SQL database tables for a car rental database.
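A minimal sketch of one possible schema; the table and column names are illustrative, and a real design would also need rates, locations, damage records, and so on:

    -- Hypothetical minimal schema; names and columns are illustrative.
    CREATE TABLE customer (
        customer_id   INTEGER PRIMARY KEY,
        name          VARCHAR(100) NOT NULL,
        license_no    VARCHAR(20)  NOT NULL
    );

    CREATE TABLE car (
        car_id        INTEGER PRIMARY KEY,
        vin           CHAR(17) UNIQUE NOT NULL,
        model         VARCHAR(50),
        daily_rate    DECIMAL(8,2)
    );

    CREATE TABLE rental (
        rental_id     INTEGER PRIMARY KEY,
        customer_id   INTEGER REFERENCES customer(customer_id),
        car_id        INTEGER REFERENCES car(car_id),
        pickup_date   DATE NOT NULL,
        return_date   DATE
    );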
Q: Write a regular expression which matches an email address.
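A regex that fully matches RFC 5322 addresses is famously enormous, so interviewers usually accept a pragmatic approximation like the one below, shown here compiled as a POSIX extended regex:

    /* Deliberately imperfect pattern: word@word.tld, nothing more. */
    #include <regex.h>

    int looks_like_email(const char *s) {
        regex_t re;
        int ok = 0;
        if (regcomp(&re, "^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}$",
                    REG_EXTENDED | REG_NOSUB) == 0) {
            ok = (regexec(&re, s, 0, NULL, 0) == 0);
            regfree(&re);
        }
        return ok;
    }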
Q: Write a function f(a, b) which takes two character-string arguments and returns a string containing only the characters found in both strings, in the order they appear in a. Write a version which is order N-squared and one which is order N.
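Here's a sketch of both versions, assuming 8-bit characters and keeping duplicates from a (the caller frees the result):

    #include <stdlib.h>
    #include <string.h>

    /* O(N^2): scan b for each character of a. */
    char *f_quadratic(const char *a, const char *b) {
        char *out = malloc(strlen(a) + 1);
        size_t n = 0;
        for (const char *p = a; *p; p++)
            if (strchr(b, *p))           /* O(len b) per character of a */
                out[n++] = *p;
        out[n] = '\0';
        return out;
    }

    /* O(N): precompute a 256-entry membership table for b. */
    char *f_linear(const char *a, const char *b) {
        unsigned char in_b[256] = {0};
        for (const unsigned char *p = (const unsigned char *)b; *p; p++)
            in_b[*p] = 1;
        char *out = malloc(strlen(a) + 1);
        size_t n = 0;
        for (const unsigned char *p = (const unsigned char *)a; *p; p++)
            if (in_b[*p])                /* O(1) per character of a */
                out[n++] = *p;
        out[n] = '\0';
        return out;
    }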
Q: You are given the source to an application which crashes when run. After running it 10 times in a debugger, you find it never crashes in the same place. The application is single-threaded and uses only the C standard library. What programming errors could be causing this crash? How would you test each one?
Q: Explain how congestion control works in the TCP protocol.
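In outline: TCP probes for capacity with slow start and additive increase, and backs off multiplicatively when it detects loss (plus fast retransmit/recovery, which I'm ignoring here). A toy model of the window dynamics, in units of segments:

    /* Toy AIMD model of a TCP sender's congestion window. Real
     * implementations (Reno, CUBIC, ...) are much more involved;
     * this only shows the core shape of the behavior. */
    double cwnd = 1.0, ssthresh = 64.0;

    void on_ack(void) {
        if (cwnd < ssthresh)
            cwnd += 1.0;          /* slow start: doubles each RTT */
        else
            cwnd += 1.0 / cwnd;   /* congestion avoidance: +1 per RTT */
    }

    void on_loss(void) {
        ssthresh = cwnd / 2.0;    /* multiplicative decrease */
        cwnd = (cwnd / 2.0 < 1.0) ? 1.0 : cwnd / 2.0;
    }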
Here's one that someone sent me in email. I've outlined a possible solution, but I haven't thought through the problem very well. Here's the question:
Design and describe a system/application that will most efficiently produce a report of the top 1 million Google search requests. You are given:
- You are given 12 servers to work with. They are all dual-processor machines with 4 GB of RAM and 4x400 GB hard drives, networked together (basically, nothing more than high-end PCs).
- The log data has already been cleaned for you. It consists of 100 billion log lines, broken into twelve 320 GB files with one 40-byte search term per line.
- You can use only custom-written applications or freely available open-source software.
1) Go through the log and count the occurrences of the search terms found, using a binary tree for search-term lookups and a long int for the counter. Do this until you have 54 million unique terms in your tree. Write a marker into the log file for each search term counted so that further passes know it has already been counted.
2) Transmit the tree over the network to the next computer and count only the occurrences of search terms which are already in the tree, marking any terms counted in the logs as such; then transmit the tree to the next computer and repeat this process until the tree has passed through each computer and ends up at the computer where the first unique search term was added to the tree (more on this in the next step). Heap-sort the tree by occurrence count (why not, it's already in a binary tree -- that's great for heap-sort!). Truncate the tree to 1 million search terms and write it to disk.
3) Every time a tree of unique search terms has "filled up" -- that is, the tree contains 54 million unique search terms -- a new unique-search-term tree must be built and sent to each computer so that the terms can be counted. A tree might begin near the end of a log on one computer and be sent to the next computer with fewer than 54 million unique search terms; when this is the case, the tree continues to collect unique search terms from the log on the next computer.
There are two kinds of processes in this algorithm which travel in rings around the computer network until they reach the computer at which they began. The first travels from computer to computer generating the unique-search-term trees (call this process #1). Once a tree fills up with unique terms, it breaks off from process #1 and is sent around the ring to count the occurrences of the search terms it already contains (call these processes Tn; there can be many). Once process #1 reaches the computer where it started, it quits. Once all the Tn processes have traveled the ring, they quit as well, after writing their top 1 million search results to disk in sorted order.

The search terms in these files are all unique and accurately counted. Just pop search terms off the top of these files, highest occurrence count first, until you have 1 million search terms. That should be it.
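For what it's worth, here's a single-node sketch of that final selection step: streaming (term, count) records through a size-K min-heap so only the top K counts survive. The counting trees and the ring-passing above are omitted, and the names and sizes are illustrative:

    /* Keep the top K counts with a size-K min-heap. The smallest of
     * the current top K sits at the root, so a new record either
     * beats it and replaces it, or is discarded. */
    #define K 1000000

    struct entry { char term[40]; long count; };
    static struct entry heap[K];
    static int heap_size = 0;

    static void sift_down(int i) {
        for (;;) {
            int s = i, l = 2 * i + 1, r = 2 * i + 2;
            if (l < heap_size && heap[l].count < heap[s].count) s = l;
            if (r < heap_size && heap[r].count < heap[s].count) s = r;
            if (s == i) return;
            struct entry t = heap[i]; heap[i] = heap[s]; heap[s] = t;
            i = s;
        }
    }

    void offer(const struct entry *e) {
        if (heap_size < K) {                   /* heap not yet full */
            int i = heap_size++;
            heap[i] = *e;
            while (i > 0 && heap[(i - 1) / 2].count > heap[i].count) {
                struct entry t = heap[i];
                heap[i] = heap[(i - 1) / 2];
                heap[(i - 1) / 2] = t;
                i = (i - 1) / 2;
            }
        } else if (e->count > heap[0].count) { /* beats current minimum */
            heap[0] = *e;
            sift_down(0);
        }
    }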