OLAP vs OLTP

OLAP

Online analytical processing (OLAP) is a system for performing multi-dimensional analysis at high speeds on large volumes of data. Typically, this data is from a data warehouse, data mart or some other centralized data store. OLAP is ideal for data mining, business intelligence and complex analytical calculations, as well as business reporting functions like financial analysis, budgeting and sales forecasting.

The core of most OLAP databases is the OLAP cube, which allows you to quickly query, report on and analyze multidimensional data. What’s a data dimension? It’s simply one element of a particular dataset. For example, sales figures might have several dimensions related to region, time of year, product models and more.

The OLAP cube extends the row-by-column format of a traditional relational database schema and adds layers for other data dimensions. For example, while the top layer of the cube might organize sales by region, data analysts can also “drill-down” into layers for sales by state/province, city and/or specific stores. This historical, aggregated data for OLAP is usually stored in a star schema or snowflake schema.

The following graphic shows the OLAP cube for sales data in multiple dimensions — by region, by quarter and by product:
image

OLTP

Online transactional processing (OLTP) enables the real-time execution of large numbers of database transactions by large numbers of people, typically over the Internet. OLTP systems are behind many of our everyday transactions, from ATMs to in-store purchases to hotel reservations. OLTP can also drive non-financial transactions, including password changes and text messages.

OLTP systems use a relational database that can do the following:

  • Process a large number of relatively simple transactions — usually insertions, updates and deletions to data.
  • Enable multi-user access to the same data, while ensuring data integrity.
  • Support very rapid processing, with response times measured in milliseconds.
  • Provide indexed data sets for rapid searching, retrieval and querying.
  • Be available 24/7/365, with constant incremental backups.

Many organizations use OLTP systems to provide data for OLAP. In other words, a combination of both OLTP and OLAP are essential in our data-driven world.

The main difference between OLAP and OLTP: Processing type

The main distinction between the two systems is in their names: analytical vs. transactional. Each system is optimized for that type of processing.

OLAP is optimized for conducting complex data analysis for smarter decision-making. OLAP systems are designed for use by data scientists, business analysts and knowledge workers, and they support business intelligence (BI), data mining and other decision support applications.

OLTP, on the other hand, is optimized for processing a massive number of transactions. OLTP systems are designed for use by frontline workers (e.g., cashiers, bank tellers, hotel desk clerks) or for customer self-service applications (e.g., online banking, e-commerce, travel reservations).

Other key differences between OLAP and OLTP
  • Focus: OLAP systems allow you to extract data for complex analysis. To drive business decisions, the queries often involve large numbers of records. In contrast, OLTP systems are ideal for making simple updates, insertions and deletions in databases. The queries typically involve just one or a few records.
  • Data source: An OLAP database has a multi-dimensional schema, so it can support complex queries of multiple data facts from current and historical data. Different OLTP databases can be the source of aggregated data for OLAP, and they may be organized as a data warehouse. OLTP, on the other hand, uses a traditional DBMS to accommodate a large volume of real-time transactions.
  • Processing time: In OLAP, response times are orders of magnitude slower than OLTP. Workloads are read-intensive, involving enormous data sets. For OLTP transactions and responses, every millisecond counts. Workloads involve simple read and write operations via SQL (structured query language), requiring less time and less storage space.
  • Availability: Since they don’t modify current data, OLAP systems can be backed up less frequently. However, OLTP systems modify data frequently, since this is the nature of transactional processing. They require frequent or concurrent backups to help maintain data integrity.

source

Entity Framework AsNoTracking

Entity Framework exposes a number of performance tuning options to help you optimise the performance of your applications. One of these tuning options is .AsNoTracking(). This optimisation allows you to tell Entity Framework not to track the results of a query. This means that Entity Framework performs no additional processing or storage of the entities which are returned by the query. However, it also means that you can’t update these entities without reattaching them to the tracking graph.

source

Weaving .net assemblies

Weaving refers to the process of injecting functionality into an existing program. This can be done conceptually at a number of levels:

  • Source code weaving would inject source code lines before the code is compiled
  • IL weaving (for .NET) adds the code as IL instructions in the assembly
  • ByteCode weaving (for Java) works on the class file, see these comments wrt AspectJ

In theory you could go one deeper and weave with an executable compiled to native instructions, but this would add a lot of complexity and I’m not aware of anything that does this.

source

HTTP Logging options

HTTP Logging is a middleware that logs information about HTTP requests and HTTP responses. HTTP logging provides logs of:

HTTP request information
Common properties
Headers
Body
HTTP response information
HTTP Logging is valuable in several scenarios to:

Record information about incoming requests and responses.
Filter which parts of the request and response are logged.
Filtering which headers to log.
HTTP Logging can reduce the performance of an app, especially when logging the request and response bodies. Consider the performance impact when selecting fields to log. Test the performance impact of the selected logging properties.

source

Nginx upstream sent too big header

Fix when Nginx is running in a proxy / reverse proxy mode

Edit your nginx.conf or virtual domain file:

1
$ sudo emacs /etc/nginx/sites-available/site.conf

Set the following in http or server or location contaxt:

server {
 proxy_busy_buffers_size   512k;
 proxy_buffers   4 512k;
 proxy_buffer_size   256k;
 # rest of nginx config #
}

Sidecar pattern

Deploy components of an application into a separate process or container to provide isolation and encapsulation. This pattern can also enable applications to be composed of heterogeneous components and technologies.

This pattern is named Sidecar because it resembles a sidecar attached to a motorcycle. In the pattern, the sidecar is attached to a parent application and provides supporting features for the application. The sidecar also shares the same lifecycle as the parent application, being created and retired alongside the parent. The sidecar pattern is sometimes referred to as the sidekick pattern and is a decomposition pattern.

A sidecar service is not necessarily part of the application, but is connected to it. It goes wherever the parent application goes. Sidecars are supporting processes or services that are deployed with the primary application. On a motorcycle, the sidecar is attached to one motorcycle, and each motorcycle can have its own sidecar. In the same way, a sidecar service shares the fate of its parent application. For each instance of the application, an instance of the sidecar is deployed and hosted alongside it.

Advantages of using a sidecar pattern include:

A sidecar is independent from its primary application in terms of runtime environment and programming language, so you don’t need to develop one sidecar per language.

The sidecar can access the same resources as the primary application. For example, a sidecar can monitor system resources used by both the sidecar and the primary application.

Because of its proximity to the primary application, there’s no significant latency when communicating between them.

Even for applications that don’t provide an extensibility mechanism, you can use a sidecar to extend functionality by attaching it as its own process in the same host or sub-container as the primary application.

The sidecar pattern is often used with containers and referred to as a sidecar container or sidekick container.

image

source

dapr

Dapr is a portable, serverless, event-driven runtime that makes it easy for developers to build resilient, stateless and stateful microservices that run on the cloud and edge and embraces the diversity of languages and developer frameworks.

Dapr codifies the best practices for building microservice applications into open, independent, building blocks that enable you to build portable applications with the language and framework of your choice. Each building block is independent and you can use one, some, or all of them in your application.

Dapr consists of a rich set of buildingblocks that will help developers in building microservices. Applications that make use of Dapr can implement one or multiple buildingblocks; so it’s not required to implement them all.

A building block is an abstract implementation of a certain pice of functionality that can be configured centrally. For example, your application could implement the Dapr publish/subscribe building block and centrally is configured that publish/subscribe is done using RabbitMQ. If you later decide to stop using RabbitMQ and go for Azure Service Bus, you only have to update the central configuration. The code of your microservices doesn’t require any changes.

image

source

Zipkin

Zipkin is a distributed tracing system. It helps gather timing data needed to troubleshoot latency problems in service architectures. Features include both the collection and lookup of this data.

If you have a trace ID in a log file, you can jump directly to it. Otherwise, you can query based on attributes such as service, operation name, tags and duration. Some interesting data will be summarized for you, such as the percentage of time spent in a service, and whether or not operations failed.
image
source

Refit type-safe REST library

Refit is a library heavily inspired by Square’s Retrofit library, and it turns your REST API into a live interface:

public interface IGitHubApi
{
    [Get("/users/{user}")]
    Task<User> GetUser(string user);
}

The RestService class generates an implementation of IGitHubApi that uses HttpClient to make its calls:

var gitHubApi = RestService.For<IGitHubApi>("https://api.github.com");
var octocat = await gitHubApi.GetUser("octocat");

.NET Core supports registering via HttpClientFactory

services
    .AddRefitClient<IGitHubApi>()
    .ConfigureHttpClient(c => c.BaseAddress = new Uri("https://api.github.com"));

source

C# Record Types

C# 9 introduces records, a new reference type that you can create instead of classes or structs. C# 10 adds record structs so that you can define records as value types. Records are distinct from classes in that record types use value-based equality. Two variables of a record type are equal if the record type definitions are identical, and if for every field, the values in both records are equal. Two variables of a class type are equal if the objects referred to are the same class type and the variables refer to the same object. Value-based equality implies other capabilities you’ll probably want in record types. The compiler generates many of those members when you declare a record instead of a class. The compiler generates those same methods for record struct types.
source