Welcome to Node.js

Node is a platform built on Chrome’s JavaScript runtime for building fast, scalable network applications.  Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient, well suited to data-intensive real-time applications that run across distributed devices.
- Node uses V8, the virtual machine that powers Google Chrome.
- Node provides an event-driven (using an event loop) and non-blocking (using asynchronous I/O) platform for server-side JavaScript, the same model JavaScript uses in the browser.

In the browser, a program performs an HTTP request for resource.json.  When the response comes back, an anonymous function (the “callback” in this context) is called with the argument data, which is the data received from the request.

$.post('/resource.json', function (data) {
  console.log(data);
});
// script execution continues

When I/O happens in the browser, it happens outside of the event loop (outside the main script execution) and then an “event” is emitted when the I/O is finished, which is handled by a function (the "callback").

A server is usually blocking, relying on multiple threads to handle concurrent requests.  Node, however, uses the same event-driven model as JavaScript in the browser.  The contrast between NGINX and Apache is similar: NGINX uses an event loop, whereas Apache traditionally relies on threads or processes.

This makes Node a good solution for data-intensive real-time applications (DIRTy applications).  An example of non-blocking I/O in Node is using the filesystem (fs) module to load a resource from disk:

var fs = require('fs');
fs.readFile('resource.json', function (er, data) {
  console.log(data);
});
console.log('Doing something else');

In this example, we read the resource.json file from disk.  When all the data is read, an anonymous function (the “callback”) is called with the arguments er, which holds any error that occurred, and data, which is the file data.

Similarly:

var callback = function (err, contents) {
  console.log(contents);
};
fs.readFile('/etc/hosts', callback);

Streaming

Node is huge on streaming.  Streams can be thought of as data distributed over time.  By bringing in data chunk by chunk, data can be handled as it comes in instead of waiting for it all to arrive before acting.

var stream = fs.createReadStream('./resource.json');
stream.on('data', function (chunk) {
  console.log(chunk);
});
stream.on('end', function () {
  console.log('finished');
});

A data event is fired whenever a new chunk is ready, and an end event is fired when all the chunks have been loaded.  Node also provides writable streams that you can write chunks of data to.  Readable and writable streams can be connected to make pipes.  This provides an efficient way to write out data as soon as it’s ready, without waiting for the complete resource to be read.

Piping example:

var http = require('http');
var fs = require('fs');
http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'image/png'});
  fs.createReadStream('./image.png').pipe(res);
}).listen(3000);
console.log('Server running at http://localhost:3000/');

The underlying asynchronous IO library (libuv) was built to be ported across devices.

Real-time Chat Application

The application shows how Node can simultaneously serve conventional HTTP data (like static files) and real-time data (like chat messages).

Node simultaneously handles HTTP and WebSocket using a single TCP/IP port.

Node Programming Fundamentals

Modules

Modules are Node’s way of keeping code organised and packaged for easy reuse.  Generally, you group related logic and move it into separate files.  In some language implementations (e.g. PHP and Ruby), incorporating the logic from another file (the “included” file) can affect the global scope.  PHP uses namespaces and Ruby uses modules to overcome this.

Node modules bundle up code for reuse but don’t alter the global scope.  Node modules allow you to select which functions and variables from the included file are exposed to the application.  If the module exposes more than one function or variable, it can specify these as properties of an object called exports.  If the module exposes a single function or variable, the property module.exports can be set instead.  Modules can then be published to the npm (Node Package Manager) registry.

Modules can either be single files or directories.  If a module is a directory, the file Node loads from it is usually index.js.  To create a module, you create a file that defines properties on the exports object with any kind of data, such as strings, objects and functions.  To use a module, call require, which takes the path to the module as an argument.  The .js extension can be omitted because it is assumed.

require performs synchronous I/O, so avoid using require in I/O-intensive parts of your application.  If you are running an HTTP server, you would take a performance hit if you called require on each incoming request.  Typically require and other synchronous operations are used only when the application initially loads.

var canadianDollar = 0.91;
// Private helper: not attached to exports, so it is not visible to the application
function roundTwoDecimals(amount) {
  return Math.round(amount * 100) / 100;
}
exports.canadianToUS = function (canadian) {
  return roundTwoDecimals(canadian * canadianDollar);
};
exports.USToCanadian = function (us) {
  return roundTwoDecimals(us / canadianDollar);
};

// Path uses ./ to indicate the module exists within the same directory as the application script:
var currency = require('./currency');
console.log(currency.canadianToUS(50)); // Use currency module’s canadianToUS function
console.log(currency.USToCanadian(30)); // Use currency module’s USToCanadian function

A module could also export a single constructor function rather than an object containing functions:

var Currency = require('./currency');
var canadianDollar = 0.91;
var currency = new Currency(canadianDollar);
console.log(currency.canadianToUS(50));  

The module.exports mechanism enables you to export a single variable, function or object.
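
The constructor-style usage above assumes that currency.js assigns a constructor to module.exports.  A minimal sketch of what such a module might look like follows; the method names mirror the earlier exports-based version, and the prototype layout is an assumption rather than the only way to write it:

```javascript
// currency.js (sketch): export a single constructor via module.exports
function Currency(canadianDollar) {
  this.canadianDollar = canadianDollar;
}

Currency.prototype.roundTwoDecimals = function (amount) {
  return Math.round(amount * 100) / 100;
};

Currency.prototype.canadianToUS = function (canadian) {
  return this.roundTwoDecimals(canadian * this.canadianDollar);
};

Currency.prototype.USToCanadian = function (us) {
  return this.roundTwoDecimals(us / this.canadianDollar);
};

// Assigning to module.exports replaces the default exports object
module.exports = Currency;
```

Assigning directly to exports (exports = Currency) would not work here, because require returns module.exports and reassigning the local exports variable breaks the link between the two.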

One-off events with callbacks (asynchronous programming)

Callbacks generally define logic for one-off responses.  For example, if you perform a database query, you can specify a callback to determine what to do with the query results.

A callback is a function, passed as an argument to an asynchronous function, that describes what to do after the asynchronous operation has completed.

Nesting three levels of callbacks isn’t bad, but the more levels of callbacks there are, the more cluttered the code becomes and the harder it is to refactor and test.  By creating named functions that handle the individual levels of callback nesting, you can express the same logic in a way that requires more lines of code but could be easier to maintain, test and refactor.

You can also reduce nesting caused by if/else blocks by returning early from a function.
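
A small sketch of both ideas, named functions instead of nested anonymous ones and early returns instead of if/else blocks.  The helper names readJson and parseJson are hypothetical and only serve the illustration:

```javascript
var fs = require('fs');

// Read a file and hand its parsed JSON contents to done(err, value).
function readJson(path, done) {
  fs.readFile(path, 'utf8', function onRead(err, text) {
    if (err) return done(err); // return early instead of wrapping the rest in else
    parseJson(text, done);
  });
}

// Named function for the second level of work, testable on its own.
function parseJson(text, done) {
  var parsed;
  try {
    parsed = JSON.parse(text);
  } catch (err) {
    return done(err); // return early on invalid JSON
  }
  done(null, parsed);
}
```

Because parseJson is a named top-level function, it can be tested without touching the filesystem.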

Node convention for asynchronous callbacks: built-in modules use callbacks with two arguments: error (er or err) and results (for example data).

Handling repeating events with event emitters (asynchronous programming)

Event emitters fire events and include the ability to handle those events when triggered.  Common Node API components such as HTTP servers, TCP servers and streams are implemented as event emitters.

Events are handled through the use of listeners.  A listener is a callback associated with an event that gets triggered each time the event occurs.  For example, in Node an HTTP server emits a request event when an HTTP request is made.  You can listen for that request event and add some response logic:

server.on('request', handleRequest);
socket.on('data', handleData);

Examples:

var net = require('net');
var server = net.createServer(function (socket) {
  // data events are handled whenever new data has been read
  // (socket.once would handle the data event only once)
  socket.on('data', function (data) {
    // Data is written (echoed back) to the client
    socket.write(data);
  });
});
server.listen(8888);

var EventEmitter = require('events').EventEmitter;
var channel = new EventEmitter();
channel.on('join', function () {
  console.log('Welcome!');
});
// The join listener only runs when a join event is emitted; this line triggers the event using the emit function
channel.emit('join');

Node’s event loop keeps track of asynchronous logic that hasn’t been completed.  As long as there’s uncompleted asynchronous logic the Node process won’t exit.  A continually running Node application may be desirable for a web server, but not for a command line tool.

You can use closures to keep the arguments of an anonymous function local in scope, so that later changes to an outer variable don’t affect an asynchronous callback.
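
A minimal sketch of the closure idea.  The function name makeReporter is hypothetical; the point is that the value passed in is captured when the function is created, not when the callback eventually runs:

```javascript
// Returns a function that remembers the color it was created with.
function makeReporter(color) {
  return function report() {
    return 'The color is ' + color;
  };
}

var color = 'blue';
var report = makeReporter(color); // captures 'blue' in the closure
color = 'green';                  // changing the outer variable has no effect on report

// The asynchronous callback still sees the captured value.
setTimeout(function () {
  console.log(report());
}, 10);
```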

Sequencing asynchronous logic (flow control)

The concept of sequencing groups of asynchronous tasks is called flow control.  There are two types of flow control: serial and parallel.  Serial tasks execute in sequence, while parallel tasks don’t have to.

To execute a number of async tasks in sequence, you could use callbacks.  But if you have a significant number of tasks, you will have to organise them.  If you don’t, you end up with excessive callback nesting.

You could use Nimble: you provide an array of functions for Nimble to execute one after the other.

Serial flow control

To execute async tasks in sequence using serial flow control, you need to put the tasks in an array in order of desired execution.  Each task exists in the array as a function.  When a task has completed, it should call a handler function to indicate error status and results.

The request module is a simplified HTTP client that you can use to fetch RSS data.  The htmlparser module has functionality to turn RSS data into JavaScript structures.
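
A hand-rolled sketch of serial flow control under these assumptions: each task is a function that receives a next(err, result) handler, and runSerial is a hypothetical helper name, not part of Node or Nimble:

```javascript
// Run tasks one after another; stop at the first error.
function runSerial(tasks, done) {
  var results = [];
  var index = 0;

  function next(err, result) {
    if (err) return done(err);
    if (index > 0) results.push(result); // record the result of the task that just finished
    if (index === tasks.length) return done(null, results);
    var task = tasks[index++];
    task(next);
  }

  next(); // start the first task
}

// Usage: the second task starts only after the first has called next.
runSerial([
  function (next) { setTimeout(function () { next(null, 'first'); }, 20); },
  function (next) { setTimeout(function () { next(null, 'second'); }, 10); }
], function (err, results) {
  if (err) throw err;
  console.log(results); // results are in task order
});
```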

Example of serial flow control: random RSS generator.

Parallel flow control

Async tasks are placed in an array, but the order is unimportant.  Each task should call a handler function that will increment the number of completed tasks.
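
A comparable sketch for parallel flow control.  The helper name runParallel is hypothetical; each task receives a handler that increments a completion counter, and the final callback fires once every task has reported back:

```javascript
// Start all tasks at once; call done when every task has completed.
function runParallel(tasks, done) {
  var completed = 0;
  var failed = false;
  var results = new Array(tasks.length);

  if (tasks.length === 0) return done(null, results);

  tasks.forEach(function (task, i) {
    task(function (err, result) {
      if (failed) return; // an earlier task already reported an error
      if (err) {
        failed = true;
        return done(err);
      }
      results[i] = result; // keep results in task order, not completion order
      completed++;
      if (completed === tasks.length) done(null, results);
    });
  });
}
```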

Community flow control modules

Nimble, Step and Seq.

Building Node Web Applications

HTTP Server fundamentals

To create an HTTP server, call the http.createServer() function.  It accepts a single argument, a callback function, that will be called on each HTTP request received by the server.  The request callback receives the request and response objects as arguments, which are commonly shortened to req and res:

var http = require('http');
var server = http.createServer(function (req, res) {
  res.write('Hello World'); // Handle request: this can be simplified to res.end('Hello World');
  res.end();
});
server.listen(3000); // Bind to a port to listen for incoming requests

Node offers several methods to progressively alter the header fields of an HTTP response, up to the first res.write() or res.end():
res.setHeader(field, value)
res.getHeader(field)
res.removeHeader(field)
res.statusCode = 404; // common to send back 404 Not Found status code when requested resource does not exist
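
A short sketch that combines these calls in one request handler.  The handler name and the response body are assumptions for illustration; the headers and status code must be set before the body is written with res.end():

```javascript
var http = require('http');

// Respond to every request with a plain-text 404.
function handleRequest(req, res) {
  var body = 'Not Found';
  res.statusCode = 404;
  res.setHeader('Content-Type', 'text/plain');
  res.setHeader('Content-Length', Buffer.byteLength(body));
  res.end(body); // headers are sent together with the first body write
}

var server = http.createServer(handleRequest);
// Call server.listen(3000) to start accepting requests.
```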

RESTful web service

HTTP verbs GET, POST, PUT and DELETE by convention map to retrieving, creating, updating and removing resources specified by the URL. cURL is a powerful command-line HTTP client that can be used to send requests to a target server.
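
A sketch of how the verbs might map onto a tiny in-memory to-do list.  The items array, the URL scheme (DELETE /0 removes the first item) and the handler name are all assumptions made for this example:

```javascript
var http = require('http');
var items = []; // in-memory store, lost on restart

function handleRequest(req, res) {
  switch (req.method) {
    case 'POST': // create: the request body is the new item
      var item = '';
      req.setEncoding('utf8');
      req.on('data', function (chunk) { item += chunk; });
      req.on('end', function () {
        items.push(item);
        res.end('OK\n');
      });
      break;
    case 'GET': // retrieve: list all items
      res.end(items.map(function (item, i) {
        return i + ') ' + item;
      }).join('\n') + '\n');
      break;
    case 'DELETE': // remove: the path holds the item index
      var i = parseInt(req.url.slice(1), 10);
      if (isNaN(i)) {
        res.statusCode = 400;
        res.end('Invalid item id\n');
      } else if (!items[i]) {
        res.statusCode = 404;
        res.end('Item not found\n');
      } else {
        items.splice(i, 1);
        res.end('OK\n');
      }
      break;
    default:
      res.statusCode = 405;
      res.end('Method not allowed\n');
  }
}

var server = http.createServer(handleRequest);
// Call server.listen(3000), then try: curl -d 'buy milk' http://localhost:3000
```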

Node provides a REPL (read-eval-print loop) interface, available by running node from the command line without any arguments.

Serving static files

Each static file server has a root directory.  The server will define a root variable to act as the static file server’s root directory.

fs.ReadStream can be used to stream the file.

Read write streams

Node can route data from its source to its destination by adding a pipe to connect the two.  The data source is called a ReadableStream that can “pipe” to some destination WritableStream.  Streams start to process data immediately, piece by piece.

The plumbing is hooked up with the pipe method:
ReadableStream.pipe(WritableStream);

An example of using pipes is reading a file (ReadableStream) and writing its contents to another file (WritableStream):

var readStream = fs.createReadStream('./original.txt');
var writeStream = fs.createWriteStream('./copy.txt');
readStream.pipe(writeStream);

Any ReadableStream can be piped into any WritableStream.  For example, an HTTP request (req) object is a ReadableStream and can stream its contents to a file:
req.pipe(fs.createWriteStream('./req-body.txt'));

See the stream-handbook.

Accepting User input from forms and file uploads

Typically two Content-Type values are associated with form submission requests:
- application/x-www-form-urlencoded: the default for HTML forms
- multipart/form-data: used when the form contains files or non-ASCII or binary data

Node community module for multipart uploads is formidable (via startup Transloadit).  Formidable is a streaming parser: it can accept chunks of data as they arrive, parse them, and emit specific parts, such as the part headers and bodies.  The lack of buffering prevents memory bloat even for large files such as videos.
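
For the simpler application/x-www-form-urlencoded case, the body can be parsed with built-in APIs.  The sketch below is an illustration only: parseFormBody is a hypothetical helper, and unlike formidable it buffers the whole body in memory, so it is unsuitable for large uploads:

```javascript
// Collect a request body and parse it as application/x-www-form-urlencoded.
function parseFormBody(req, callback) {
  var body = '';
  req.setEncoding('utf8');
  req.on('data', function (chunk) { body += chunk; });
  req.on('end', function () {
    var fields = {};
    // URLSearchParams decodes percent-escapes and treats + as a space
    new URLSearchParams(body).forEach(function (value, name) {
      fields[name] = value;
    });
    callback(null, fields);
  });
  req.on('error', callback);
}
```

Inside a request handler this would be called as parseFormBody(req, function (err, fields) { ... }).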

Storing Node Application Data

Data considerations:
- what data is being stored
- how quickly data needs to be read and written to maintain adequate performance
- how much data exists
- how data needs to be queried
- how long and reliably the data needs to be stored

Storing data in server memory:
- this maximises performance, but it’s less reliably persistent because data will be lost if the application restarts or the server loses power.

Interfacing with a database management system (DBMS):
- long-term persistence of complex structured data, along with search facilities, but at a performance cost.

Serverless data storage

In-memory storage: in-memory storage uses variables to store data.  Reading and writing this data is fast, but you’ll lose the data during server and application restarts.

File-based storage: file-based storage uses a filesystem to store data.  This type of storage is often used for application configuration information.  It allows you to persist data so that it survives application and server restarts.  For a multiuser application, a DBMS is more sensible because it is designed to deal with concurrency issues (for example, two users loading and modifying the same file at the same time).
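
A sketch of file-based storage for a list of tasks kept as JSON.  The function names loadTasks and saveTasks and the file layout are assumptions for illustration; note that nothing here guards against two writers modifying the file at once:

```javascript
var fs = require('fs');

// Load the task list; a missing file is treated as an empty list.
function loadTasks(file, cb) {
  fs.readFile(file, 'utf8', function (err, text) {
    if (err && err.code === 'ENOENT') return cb(null, []);
    if (err) return cb(err);
    var tasks;
    try {
      tasks = JSON.parse(text || '[]');
    } catch (parseErr) {
      return cb(parseErr);
    }
    cb(null, tasks);
  });
}

// Overwrite the file with the current task list.
function saveTasks(file, tasks, cb) {
  fs.writeFile(file, JSON.stringify(tasks), 'utf8', cb);
}
```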

Relational database management systems

Traditionally used for high-end applications such as content management, customer relationship management and shopping carts.  They require specialised administration knowledge and access to a database server.  They require knowledge of SQL, although there are object-relational mappers (ORMs) with APIs that write SQL in the background.

MySQL / PostgreSQL: good for building reports with the data using SQL queries.

NoSQL database systems

Redis and MongoDB / Mongoose

Connect Middleware

Middleware can be used to sequence the operation of the application.  Focus on small configurable pieces when building middleware: tiny, modular and reusable middleware components that collectively make up the application.

For example:
- only showing content after a user has authenticated;
- configuring an application router

Connect's Built in Middleware

Common implementations for web applications like:
- session management;
- static file serving;
- outgoing data compression.

Express

Express builds upon Connect to provide a higher-level web framework.
npm install -g express // Install express globally
express --help // provides options available
express --flag [directory] // generate the express boilerplate

Express has a minimalistic configuration system driven by the NODE_ENV environment variable, consisting of five methods:
- app.configure(), app.set(), app.get(), app.enable(), app.disable().

Views

Express provides two ways to render views: at the application level with app.render() and at the request response level with res.render().

__dirname is a global variable in Node that identifies the directory in which the currently running file exists.

The view lookup process is similar to Node’s require(). When res.render() or app.render() is invoked, Express will first check whether a file exists at an absolute path.  Next, Express will look relative to the views directory as configured.  Finally, Express will try an index file.

Advanced Express

Authentication
Pagination
RESTful APIs

Assert: core Node testing module