Dan's .NET Blog: 2016

Sunday, August 7, 2016

Handling Whitespace

In this post I want to offer some tips for a very simply issue, but one that can cause you a lot of headaches if you don’t deal with it.

Let’s say we are validating user input and we need to be sure the user has entered something in a specific field. This data may be coming from a form, or XML file, etc., so for demonstration purposes I will just use string.

  string s1 = "";
  string s2 = null;
  string s3 = " ";
  string s4 = "\t";

  if (s1 == "") Console.WriteLine("s1 not entered");
  if (s2 == "") Console.WriteLine("s2 not entered");
  if (s3 == "") Console.WriteLine("s3 not entered");
  if (s4 == "") Console.WriteLine("s4 not entered");

If I am checking for user input in a field I would want all four of these to be considered invalid, but if I run the program, only the first one will be found, the other three will look ok. This happens because we are comparing against a literal empty string which is not the same as a null, a space, or a tab. Here is another way to do this check...

 
  if (string.IsNullOrEmpty(s1)) Console.WriteLine("s1 not entered");
  if (string.IsNullOrEmpty(s2)) Console.WriteLine("s2 not entered");
  if (string.IsNullOrEmpty(s3)) Console.WriteLine("s3 not entered");
  if (string.IsNullOrEmpty(s4)) Console.WriteLine("s4 not entered");

The string class provides a function call IsNullOrEmpty that can help with the second case. This function will return true if the the string is empty or it’s a null. This is very useful when you are about to call a function on a string, for example SubString(), because this would throw an exception if the string is null. This is a step in the right direction, but it still doesn’t help with the last two cases since a string with a space or tab aren’t considered empty. Let’s look at another function on the string class:

 
if (string.IsNullOrWhiteSpace(s1)) Console.WriteLine("s1 not entered");
if (string.IsNullOrWhiteSpace(s2)) Console.WriteLine("s2 not entered");
if (string.IsNullOrWhiteSpace(s3)) Console.WriteLine("s3 not entered");
if (string.IsNullOrWhiteSpace(s4)) Console.WriteLine("s4 not entered");

If you run this code it will recognize all four strings as invalid. IsNullOrWhiteSpace works similar to IsNullOrEmpty but it also looks for any combination of white space, spaces, tabs, carriage returns, etc.
It’s a good idea to avoid comparing a string to and empty string, and better to use one of these functions instead.
Even when you are not just looking for whitespace or not it can give you trouble. Let’s look at this example:

  
string s1 = "OK";
string s2 = " OK ";
string s3 = "\tOK\t";

if (s1 == "OK") Console.WriteLine("s1 is OK");
if (s2 == "OK") Console.WriteLine("s2 is OK");
if (s3 == "OK") Console.WriteLine("s3 is OK");

If you run this, only the first case if found to be OK because the others two have whitespace before and after the string. We can solve this by using the Trim function:

  
if (!string.IsNullOrEmpty(s1) && s1.Trim() == "OK") Console.WriteLine("s1 not entered");
if (!string.IsNullOrEmpty(s2) && s2.Trim() == "OK") Console.WriteLine("s2 not entered");
if (!string.IsNullOrEmpty(s3) && s3.Trim() == "OK") Console.WriteLine("s3 not entered");

Since Trim is an instance method we need to first be sure s1 isn’t null so I use the IsNullOrEmpty function for this. Now that we know the string isn’t null we can call Trim() on it which removes any leading or trailing whitespace. Now all three of these will find OK.

Thursday, May 19, 2016

ASP.NET Core 1 - Blank Template Part 2

In my last post I walked through the code of the blank template that is created for an ASP.NET Core 1 RC1 application. In this post we will take a closer look at the how this application works.

First let’s run the application by clicking the IIS Express button in Visual Studio.

Your browser should open and display the message “Hello World!”. If you look in your system tray you will find IIS Express serving up WebApplication1 so you application is running on IIS. If you now go into your browser’s debugging tools (for example, press F12 in Internet Explorer), go to network monitor and refresh the page you will be able to view response header:

You can see that even though we are running on IIS, the web server is reported as Kestrel since that is the actual server we are running on, IIS is just serving as a reverse proxy passing the requests to Kestrel.

Next click the down arrow next to IIS Express:

In the list below IIS Express you will see all the commands that have been declared in the project.json file. In the blank template there is only one command “web”. Click on web to start the application using this command. This time instead of a web browser opening you will see a window like this:

In this case we are running directly on Kestrel instead of Kestrel running behind IIS. To view the output of the application use the URL that is being listened on shown in the window, in this case http://localhost:5000. You will not see IIS Express in the system tray and if you look at the network monitor it will still show the server as Kestrel.

Now let’s look at the IISPlatformHandler. This piece of middleware allows Windows Integrated Authentication and TLS Authentication to pass through to Kestrel. In Startup.cs change the line that displays “Hello World” to this:

await context.Response.WriteAsync("Hello " + context.User.Identity.Name);

Just like previous versions of ASP.NET you can use the User property of the HttpContext to get authentication information. To run your app in IISExpress using Windows Authentication go to the project properties, go to the Debug tab, make sure the IISExpress profile is selected and then uncheck Enable Anonymous Authentication, and check Enable Windows Authentication.

If you run this using IISExpress you will see your account name if you are on a Windows domain, or you will be prompted to enter your username and password and again the name will be displayed. In this scenario the IISPlatformHandler is handling the authentication and passing their information along to Kestrel in the context.

Now run the application use the web command. When you browse to the page this time you will just get “Hello”. In this scenario we are connecting directly to Kestrel and not going through IISExpress so the IISPlatformHandler doesn’t have an effect.

With the web page displayed try adding a file to the end of the URL, for example http://localhost:63694/test.html. When your browse to this URL the page will still return “Hello”. How about adding a path, http://localhost:63694/test/test.html. Again you will only get “Hello” as a response. As I mentioned in my introductory article ASP.NET Core 1 applications have no functionality by default, not even the ability to server files. Every URL we use just runs the same middleware to display the hello message. In the a future post I will show how to enable various pieces of stock middleware to enable these features.

I hope the demonstrations in this article give you a better idea of some of the differences between the new ASP.NET and previous versions.

Sunday, May 8, 2016

ASP.NET Core 1–Blank Template

In my last post I talked about some of the basics of ASP.NET Core 1, in this posted we will get setup to do some development and take a look at some code. I am going to be working on Windows using Visual Studio but the concepts still apply to doing ASP.NET on Mac or Linux. As of this writing the current version of ASP.NET Core 1 (which was formally know as ASP.NET 5) is RC1 and that is what this article is based on.

The instructions for setting up ASP.NET Core 1 on Windows can be found here:

http://docs.asp.net/en/latest/getting-started/installing-on-windows.html

That site also has instructions for getting setup on Mac and Linux. I did this install on two different systems. It worked fine on one, but on the other, when I created a new project in Visual Studio, I got an error saying “Method not found: 'Newtonsoft.Json.Linq.JValue Newtonsoft.Json.Linq.JValue.CreateNull()'.”. I eventually figured out that I had version 6.0.0 of Json.NET installed in the GAC. I installed version 6.0.6 to the GAC and this resolved the problem.

Once you have completed the installation, start up Visual Studio 2015 and select New Project and then under C#, Web, pick ASP.NET Web Applciation. If you have installed everything properly you should see a section called ASP.NET 5 Templates.

We will start with a very basic application so select the Empty project type and be sure Host in the cloud is unchecked.

Let’s take a look at some of the files that make up the default Empty project. We will start with project.json. This file basically replaces web.config, although there is a web.config in the project which I will talk about later, and is required by the DNX runtime to execute the application.

"version": "1.0.0-*",

The first line is just a piece of metadata that declares the version of you application. You can also specify things like “authors” or “description”.

"compilationOptions": {
"emitEntryPoint": true
},

Next we have the compiler options. The only option that is included by default is “emitEntryPoint”. If this is set to true then the project will be an executable, if it’s false the project will be a DLL that since it will have no execution entry point.

"dependencies": {
"Microsoft.AspNet.IISPlatformHandler": "1.0.0-rc1-final",
"Microsoft.AspNet.Server.Kestrel": "1.0.0-rc1-final"
},

The next section, dependencies, declares the assemblies that this project needs to run. These will normally come from NuGet but also could come from other projects. The two dependencies declared by default are used to handle the web hosting of the application. As I mentioned in my previous post, ASP.NET Core 1 is not tied to any specific web server. Here we see a reference to Microsoft’s open source and cross platform Kestrel web server. This compact web server will run as part of your application and can handle requests directly or can run behind and existing web server. Note that Kestrel is not a full featured web server like IIS, other packages will need to be included in your project to do even simple things like serving static files.

The other dependency is the IISPlatformHandler, which is needed when you are running Kestrel behind IIS. In this configuration IIS simply serves as a reverse proxy passing requests back and forth to Kestrel. The IISPlatformHandler allows things like Windows Authentication to pass through to your application.

"commands": {
"web": "Microsoft.AspNet.Server.Kestrel"
},

This section defines entry points for your application that can be run from the command line or Visual Studio. When you execute the program using a command it will looks for a Main function in the specified assembly and start execution there. Commands can also have command line options that will get passed to the Main function. You can have multiple commands in a project that allow it to be run in different ways.

"frameworks": {
"dnx451": { },
"dnxcore50": { }
},

The next section defines the frameworks that the application can run with. As of this writing you have the options of three version of the full .NET Framework, dnx451, dnx452, and dnx46, and also the .NET Core Framework, dnxcore50. In the default project both core and the full framework are selected, so the application can only use features that exist in both frameworks.

"exclude": [
"wwwroot",
"node_modules"
],

In ASP.NET Core 1 you have the option of deploying all your source files to the server and they will be compiled into memory when the application is first started. Every file in the folder with a project.json file, and any subfolders will become part of the program. You can prevent files from being included in the project by setting up exclusions. Here we are excluding the contents of the wwwroot and node_modules sub directories.

"publishExclude": [
"**.user",
"**.vspscc"
]

The final section is similar to the exclude section, but this one specifies files that you don’t want published along with your project.

I mentioned earlier that even though project.json takes the place of Web.Config you will still see a Web.Config file in the wwwroot directory of the default template. This file configures the httpPlatformHandler which is an IIS module that allows it to serve as a proxy in front of another web server, in this case acting as a proxy for Kestrel. This module is only used on Windows when running behind IIS.

The final file we will look at is Startup.cs which contains that code that is called to start up your application. Let’s start with the last line:

public static void Main(string[] args) => WebApplication.Run<Startup>(args);

This is the Main function that will be called if you start the application from the DNX command line. It simply starts the web application and passes the command line arguments. If you are running the application using IIS or IIS Express this function will be skipped and WebApplication.Run will be called directly.

When the web application is started up the first function in this class that will be called is ConfigureServices. This class is used to setup any services that your application will need, for example EntityFramework, and makes these services available for dependency injection. In the empty template nothing is done in this function.

The final part of the startup process is a call to the Configure function which looks like this:

public void Configure(IApplicationBuilder app)
{
app.UseIISPlatformHandler();
app.Run(async (context) =>
{
await context.Response.WriteAsync("Hello World!");
});
}

The purpose of this function is to build the HTTP request pipeline. As I mentioned in my introductory article, ASP.NET Core 1 applications have almost no functionality out the box. They can receive HTTP requests, but without some setup they can’t do anything with them. The Configure function sets up the processing pipeline be enabling one or more piece of Middleware. Middleware receives an HTTP request, does some sort of processing based on it, and then either returns a response to the requestor, or calls the next piece of Middleware in the chain.

In the blank template you will find two pieces of Middleware. The first is the IIS Platform Handler. This Middleware is used to handle IIS specific authentication like Windows Integrated Authentication. This is only useful when running behind IIS, so it wouldn’t have any function when running on Mac or Linux. When this Middleware completes its task it sends the request on to the next piece of Middleware.

The final piece of code is the one that actually sets up what this application will do, which is to simply return the text “Hello World!” to the calling browser. The app.Run function adds and inline delegate that will be called each time a request to this application is recieved by the server. When called it calls the WriteAsync function to write the text back to the response.

In this post I showed the ground work for a very basic ASP.NET Core 1 application. In my next post we will do a little more with this application to get a better understanding of how things work.

Friday, April 29, 2016

ASP.NET Core 1 Intro

There are a lot of changes currently going on in the Microsoft development work in general and with .NET specifically, so I wanted to write some articles to look at one area in particular and that is ASP.NET. The changes going on with ASP.NET are pretty radical and exactly what is happening can be a little confusing. This is only made worse by a recent name change for the next version of ASP.NET.

Last year Microsoft announced a new vesion of ASP.NET that they called ASP.NET 5.0. Most updates to the .NET components are incremental updates that build on what came before. ASP.NET 5.0 on the other hand is a total re-engineering of ASP.NET from the ground up. To meet some of the design goals of the new version Microsoft has dropped, at least for now, some features from ASP.NET. For this reason Microsoft recently changed the name from ASP.NET 5.0 to ASP.NET Core 1.0. The new version is only at RC1 at the time this was written, but when it is released there will be two release version of ASP.NET, the full featured 4.6, and the new Core 1.0.

So what is different in ASP.NET Core 1.0. Here are some of the major bullet points.

Cross Platform Execution - The word “Core” in the new name is a reference to .NET Core which is a new version of the .NET Framework designed from the ground up to run on Windows, Mac and Linux. The new version of ASP.NET can run on either the existing 4.6 Framework, or on top of the new Core framework which allows it to be hosted on non-Windows servers. Since a lot of the things in the framework need to be updated to work cross platform, Core currently supports a lot fewer namespaces then the 4.6 framework.

Cross Platform Development - Besides running cross platform, .NET Core applications can be developed on Windows, Mac and Linux. Windows development can be done in Visual Studio as before, and all three environments can use command line tools and the new cross platform Visual Studio Code editor.

New Runtime - ASP.NET Core 1 uses a new run time called DNX. DNX is not only a key component of the cross platform support but also provides all the tools needed to build apps on each platform. DNX contains command line tools used to manage the installed frameworks, manage dependencies and execute .NET core applications.

Modular - Core 1.0 has been designed to be highly modular, you only bring in the features you need using NuGet. A bare bones ASP.NET Core 1.0 application cannot do much beyond handing HTTP requests and repsonses. If you want static file handling, there is a module for that, logging is another module, MVC yet another, etc.

Performance - The new version of ASP.NET has received a lot of performance optimizations. A lot of this performance comes from the modular nature. Since the HTTP pipeline only needs to contain what you absolutely need for your app you can have a much leaner and faster pipeline. Even with full MVC enabled Core 1.0 provides better request performance than the previous version.

Web Server - The previous versions of ASP.NET were closely tied to IIS, your application actually ran as a processes within IIS. To allow for cross platform support ASP.NET Core 1 breaks the dependancy and runs your application is a seperate process. On Windows IIS can still used, but is can serve as just a reverse proxy that passes requests back and forth to ASP.NET via the open source Kestrel web server of the more lightweight WebListener. Kestrel can also be used to run cross platform.

Open Source - ASP.NET Core 1 is open source, you can download the current source on https://github.com/aspnet. Besides being open source, the whole development process has been very open. You can watch the weekly update videos from the development team here, https://live.asp.net/.

In my next post we will take a look at some actual ASP.NET Core 1 code.

Sunday, February 21, 2016

SQL Server Indexes

When it comes to improving the performance of database queries one of the most important areas to look at is Indexes. Proper indexes can actually lead to queries running hundreds if not thousands of times faster. Despite this, indexes are often neglected by software developers. In this article I will talk about the basics of using indexes with Microsoft SQL Server.

First, let’s talk about what an index is. An index on a database table is actually a lot like the index in the back of a book. If a book didn’t have an index and I needed to find every occurrence of a certain word, I would have to read every page in the book to find them. When the book does have an index, I can quickly scan through the index and find the word I am looking for which will point me to where to find it in the book. Since the index is in alphabetical order, I don’t even need to read the whole thing, I can quickly jump to the page that has the word I am looking for.

When you create an index on a table, the server will store the indexed field or fields in alphabetical/numeric order so that any value in those fields can be rapidly found in the index. Let’s take a look at some queries and how they can be improved with indexes.

I have included a script along with this article that will create a very simple example of a sales order header table and will populate it with some semi-random data. Simply run this script on a test database and it will create a table called SalesOrder and fill it with data.

To analyze the performance of the queries we will use two tools within SQL Server Management Studio. First, before executing each query we will run the command “set statistics io on” which will show how many reads the server does from each table in the query. After the query we will run “set statistics io off” to turn this off. The second tool is the Execution Plan. You turn this on by selecting Include Actual Execution Plan from the Query menu before you run the query. We will talk about how to use these two tools in more detail when we execute our first query.

Let’s start out with this query…

set statistics io on   

select OrderNumber from SalesOrder where orderNumber = '10017'   

set statistics io off

Run this query, then click on the Messages tab to look at the statistics, you will see something like this:

Table 'SalesOrder'. Scan count 1, logical reads 59, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

The two values we want to focus on are logical reads and physical reads. All data in SQL Server is stored in 8K pages, so these values are the number of pages that were read during the execution of the query. A physical read means that the page was read from disk. Once a page is read from disk it can be cached in memory in case it is needed again. Reading a cached page would be a logical read. This query did 59 logical reads, but if the data had not been accessed recently these could have been physical reads. Either way we want to try to minimize this number. Our data records are pretty small so there are going to be a lot of records in each page in this case, so you can see that 59 reads means it had to read a lot of records to get the one we were looking for.

To understand why this happened let’s look at the Execution Plan tab. If you don’t see this tab be sure “Include Actual Execution Plan” is selected on the Query menu.

This shows that to execute the query the server did a Table Scan which means I had to read every row in the table to find the one we are looking for. It doesn’t know that OrderNumber is unique so it doesn’t stop even when it finds the record because it assumes that there might be more than one record with the same order number.

Now let’s create an index that will make this query more efficient. This query will create an index on the SalesOrder table for the OrderNumber column. Don’t worry about the NONCLUSTERED keyword, we will talk about this later.

CREATE NONCLUSTERED INDEX IX_SalesOrder ON dbo.SalesOrder ( OrderNumber )

With the index created let’s re-run the original query and look at the results. Here is the statistics output:

Table 'SalesOrder'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

You can see that the number of reads has dropped from 59 down to only 2 which is a considerable improvement. Let’s look at the execution plan:

The Table Scan has changed to an Index Seek. Index Seeks are the best case for a query. This means it’s using the index to rapidly find the record(s) we are looking for.
So far we have only been returning the OrderNumber, so now let’s return both the OrderNumber and the Customer:

set statistics io on   

select OrderNumber,Customer from SalesOrder   
where orderNumber = '10017'   

set statistics io off

If you look at the statistics you will see it has gone up from two reads to three. Let’s look at the execution plan to see why.

The query still uses the Index Seek but we now have a second step which is a Lookup. To understand this you need to understand how SQL Server stores indexes. When a NONCLUSTERED index is created the values in the field being indexed are copied to another part of the database in sorted order. Each of these index records also has a pointer back to the original data which is where all the fields are stored. Since the index and the data are stored in different pages in the database, once the value we are looking for is found in the index, the server must read the corresponding data page to get the customer field. This lookup isn’t a big issue for this query since we only have one record to lookup, but it would become a bigger problem if we were returning multiple records. One way we could fix this is to add the Customer field to the index.

drop INDEX IX_SalesOrder ON dbo.SalesOrder    
CREATE NONCLUSTERED INDEX IX_SalesOrder_1 ON dbo.SalesOrder     
(     
OrderNumber,     
Customer     
)

This query drops and re-creates the index so it includes both the OrderNumber and Customer. If you re-run the previous query you will see that the lookup goes away. The server uses the index to find the order you are looking for, and since the customer is part of the index it no longer needs to lookup the data page to get the customer field value.

Now that we have Customer as part of an index, let’s try to run a query on it.

select OrderNumber from SalesOrder    
where Customer = '30075'

Here is the execution plan:

Notice that we are now getting an Index Scan, instead of a Seek. When you create an index with multiple fields the index record is created by appending the two fields together in the order you specified them when you created the index. So if a record has a customer of 30075 and an order number 10048, the index essentially stores “1004830075”. Since the index is sorted by the order number first, it cannot use an Index Seek to find the customer number, but it can scan through all the index records to find the customer. The index Scan is preferable to a Table Scan because an index record will usually be much smaller than the table record, so there are a lot more records per page which results in fewer page reads. This query required 40 page reads instead of the 59 that was required for the whole table. This difference would get even larger if there were more fields in the table.

Now let’s create an index on the customer:

CREATE NONCLUSTERED INDEX IX_SalesOrder_Cust ON dbo.SalesOrder   
(   
Customer   
)

and then run a query that returns all the order numbers for a specific customer:

select OrderNumber from SalesOrder    
where Customer = '30075'

If you look at the execution plan you will notice something odd, it’s ignoring the index and still doing an index scan:

The reason this happens is that when SQL Server generates an execution plan it not only considers the structure of the table and indexes, but also looks at statistics about the data in the index. The details of Statistics are outside the scope of this article, but for the purpose of this query I will say that the statistics for the Customer index will tell SQL that customer numbers often appear multiple times in the table. Since the server needs to do a lookup to get the order number for each record, it assumes that it will just be faster to scan the order index, then it would be to use the customer index and do the lookups. I can re-run the query and force SQL Server to use the customer index like this:

set statistics io on    
select OrderNumber from SalesOrder with (index(IX_SalesOrder_Cust))     
where Customer = '30075'     
set statistics io off

The version of the query that did an index scan had to do 40 reads, but when I force this query to use the customer index it required 92 reads, so in this case the server guessed correctly that the table scan is more efficient. The customer number in the test data is randomly generated so using the index will not always have been less efficient. If you happen to use a customer number that only has one or two orders, then the index version would be faster. SQL Server is making a best guess at what is most efficient, and even though certain executions of the query may end up being less efficient, in the long run SQL’s guess will likely result it better query performance.

To get SQL to use the index in this case we could do what we did with the order number index, and add the customer to it, but since we already have an index on the order number it’s probably not worthwhile to have the order number as part of the customer index, so there is a second option for handling this:

drop index [IX_SalesOrder_Cust] ON [dbo].[SalesOrder]

CREATE NONCLUSTERED INDEX [IX_SalesOrder_Cust] ON [dbo].[SalesOrder]  
(  
[Customer] ASC  
)  
INCLUDE ([OrderNumber])

In this index I have added the INCLUDE clause which adds the OrderNumber to the index, but doesn’t actually make it part of the indexed field. Now SQL can use this index without having to do a lookup to get the OrderNumber.

You might wonder why you would use INCLUDE instead of just making the field part of the index? If you are not going to use the field in a WHERE clause, then having it as an INCLUDE is more efficient since the server doesn’t need to consider the value when updating the index which helps improve the performance of inserts, updates and deletes. Another advantage is that you can INCLUDE type of fields which cannot be indexed, Text fields for example.

So far we have been limiting the fields we return in the select, so let’s trying retrieving all the fields:

select * from SalesOrder   
where Customer = '30075'

This will once again result in a table scan. Even though we have an index on Customer, and SalesOrder is included in the index it would still need to do a lookup to get the rest of the fields, so it decides that a table scan would be more efficient. We can solve this issue by using a Clustered Index.

So far all the indexes we have built have been non-clustered indexes. As I explained earlier a non-clustered index stores the indexed fields in another part of the database with a pointer back to the actual data. By contrast, a clustered index stores the actual data records in the index order so no matter what fields we are selecting the lookup is not needed since the index and data are all in the same place. Since this index controls the order of the data, there can only be one clustered index per table because the table can only be sorted one way.

Let’s drop the original customer index and re-create it as clustered. The include is no longer needed since a clustered index effectively has all the fields included.

drop index [IX_SalesOrder_Cust] ON [dbo].[SalesOrder]   
CREATE CLUSTERED INDEX IX_SalesOrder_Cust ON dbo.SalesOrder    
(    
Customer    
)

Now if you re-run the query it will do an index seek.
Now let’s try this query.

set statistics io on   
select * from SalesOrder    
where orderDate = '1/1/2015'    
set statistics io off

Because every field is part of the clustered index, the execution plan will show that it did a clustered index scan. The statistics on this query show 66 reads which is 2 more reads then a table scan. I have read that due to the way the data is organized that the scan is actually a little more efficient then a table scan but index scans on a non-clustered index will normally be much more efficient then on a clustered index.

Everything I have shown here so far is for conditions in the WHERE clause, but this all applies to joins also, just think of the ON part of the join as being a WHERE. These indexes will also come into play when you do updates or deletes with a WHERE clause.

Dan's .NET Blog