EaseFilter Inc.

EaseTag Tiered Storage Filter Driver SDK

The Challenge

For most companies, data, especially unstructured data, continues to grow by 50 percent annually. Spending more every year on storage, and on protecting and managing information, has often pushed IT departments to their limits. Combined with longer mandated retention periods, today’s information management challenges are forcing IT staff, from the vice president to the systems administrator, to reduce complexity and cost without putting their organization’s infrastructure, information, and intellectual property at risk. EaseFilter Inc. developed the EaseTag Tiered Storage SDK to help you migrate your storage to the cloud seamlessly.

EaseTag Tiered Storage SDK

The EaseTag Tiered Storage Filter Driver SDK implements tiered storage, also known as hierarchical storage management (HSM): a data storage technique that automatically moves data between high-cost and low-cost storage media, such as network attached storage (NAS), optical discs and cloud storage. A stub replaces each migrated file on the fast disk drives. On the local system, a stub file looks and acts like a regular file. When a user application accesses a migrated file stub, the Windows operating system transparently directs the file access request to the EaseTag filter driver, which sends the request to the remote site and retrieves the data from the repository to which it was migrated (see Figure 1).


Figure 1. Tiered Storage Data Flow Chart

The automated tiered storage can integrate with existing applications, without affecting the original data and programs. Without any modification of existing applications, the local storage can automatically be extended to the network storage.

Tiered storage can be widely used in telecommunications, government, oil, medical and other industries. It is the first choice for medical PACS (Picture Archiving and Communication Systems, which store and transmit medical images): much of the data in such applications is rarely accessed, so it is transferred to less expensive network storage. Access to the stub files in local storage is completely transparent to users and applications; the system automatically restores the data from the network storage server back into the stub file. Because network attached storage is scalable, tiered storage products give users a practically unlimited online data space.

The main advantages of EaseTag Tiered Storage are:

Lower Storage Costs

If you have two terabytes of expensive server storage where 50% of the data is never or rarely accessed, EaseTag Tiered Storage lets you transfer one terabyte of data to NAS or cloud storage and free one terabyte of space on the SAN RAID storage.

Maximize Available Server Disk Space

Reclaim storage space without disrupting users. Set up policies that automatically remove older files from file servers to free disk space and replace them with an intelligent shortcut "stub" that invisibly retrieves the original file from the archive.

Improve Efficiency

When a user needs the data, it can be accessed transparently in real time. If you need to recover a file that was accidentally deleted or modified in error, you can restore the file, or even a whole folder, from the repository.

Server backup and recovery times are shorter, because only frequently used files need to be backed up.

Improve Data Security

The data on the server can be encrypted and accessed only through the storage management software, so only authorized users can reach it, and access activity can be logged.

Remove Duplicate Data

The storage server keeps only a single instance of duplicate data.

Best-practice cloud storage deployment procedures

The following sections provide some best-practice procedures to assist you in planning and implementing your cloud storage deployment.

1. Set up migration policies:

a) Migrate and stub files based on file type, for example process only .JPG, .PPT, .BMP, .GIF, .TIF, .PDF, and .PSD files.

b) Migrate and stub files based on file size, for example process only files larger than 10 MB.

c) Migrate and stub files based on file timestamp, for example process only files whose creation, modification, or last access time is more than 60 days old.
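
The three policy types above can be sketched as a simple predicate. This is only an illustration and not part of the EaseTag SDK: the `FileInfo` type, the `should_migrate` function and the constant names are hypothetical, the thresholds are the example values from the list, and combining all three rules with "and" is an assumption (a real deployment might apply them independently).

```python
import time
from dataclasses import dataclass


@dataclass
class FileInfo:
    """Hypothetical record describing one candidate file."""
    name: str
    size_bytes: int
    last_access_epoch: float  # seconds since the epoch


# Example policy values taken from the list above (assumed, not SDK defaults).
MIGRATE_EXTENSIONS = {".jpg", ".ppt", ".bmp", ".gif", ".tif", ".pdf", ".psd"}
MIN_SIZE_BYTES = 10 * 1024 * 1024       # 10 MB
MIN_AGE_SECONDS = 60 * 24 * 60 * 60     # 60 days


def should_migrate(info: FileInfo, now: float) -> bool:
    """Return True when the file matches all three example policies."""
    dot = info.name.rfind(".")
    ext = info.name[dot:].lower() if dot != -1 else ""
    return (
        ext in MIGRATE_EXTENSIONS
        and info.size_bytes > MIN_SIZE_BYTES
        and now - info.last_access_epoch > MIN_AGE_SECONDS
    )


now = time.time()
old_scan = FileInfo("scan.TIF", 50 * 1024 * 1024, now - 90 * 86400)
new_scan = FileInfo("scan.TIF", 50 * 1024 * 1024, now - 5 * 86400)
small_doc = FileInfo("notes.pdf", 200 * 1024, now - 400 * 86400)

for candidate in (old_scan, new_scan, small_doc):
    print(candidate.name, should_migrate(candidate, now))
```

Only the first file qualifies: the second was accessed too recently and the third is below the size threshold.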

The example below shows how the files are migrated.

The volume ArchiveVolume has a total capacity of 3.34 TB. It contains a folder called ArchiveFolder with two very large files, one of 100 GB and one of 10 GB, so the total space on disk is 110 GB (see Figures 2 and 3).


Figure 2. The volume and folder properties before migration


Figure 3. The file property before migration

2. Migrate the data and create stub files

After the files have been transferred to the cloud data center, they are turned into stub files (see Figure 5). A stub file takes only 4 KB of physical space, while the reported file size does not change. Comparing Figure 2 with Figure 4 shows that the volume gains 110 GB of free space, because the physical data space of the original files has been released. The stub files carry the offline icon, which tells backup and anti-virus software not to read them.


Figure 4. The volume and folder properties after migration


Figure 5. The file property after migration

3. Access stub files or restore files

When a user application accesses a stub file, the EaseTag Tiered Storage SDK can read back just a block of the file's data. This is especially useful for large files, because it is much faster than restoring the whole file. For example, if the application only wants to read 64 KB at offset 0 of a 100 GB file, only those 64 KB are returned. For applications that need to read the whole file, the SDK can also restore the entire file on the first read.
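
To make the block-read idea concrete, here is a minimal sketch of how a read request could be mapped to the blocks that must be fetched from the repository. It is not the SDK's implementation: the `blocks_for_read` function and the fixed 64 KB `BLOCK_SIZE` are assumptions chosen to match the example in the text.

```python
BLOCK_SIZE = 64 * 1024  # assumed recall granularity, matching the 64 KB example


def blocks_for_read(offset: int, length: int, file_size: int,
                    block_size: int = BLOCK_SIZE) -> range:
    """Return the block indexes that cover [offset, offset + length)."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    end = min(offset + length, file_size)   # never read past end of file
    if offset >= end:
        return range(0)                      # nothing to fetch
    first = offset // block_size
    last = (end - 1) // block_size
    return range(first, last + 1)


FILE_SIZE = 100 * 1024**3  # the 100 GB file from the example

head = blocks_for_read(0, 64 * 1024, FILE_SIZE)              # one block
straddle = blocks_for_read(60 * 1024, 8 * 1024, FILE_SIZE)   # crosses a boundary

print(list(head), list(straddle))
```

A 64 KB read at offset 0 touches a single block, while a small read that straddles a block boundary needs two. Either way the cost is tiny compared with recalling all 100 GB.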

4. Data disposition

When files are no longer needed, they can be disposed of from the repository. The data will be deleted permanently.

About EaseFilter Inc. (http://www.easefilter.com)

We specialize in file system filter driver development. We architect, implement and test file system filter drivers for a wide range of functionality. We can offer several levels of assistance to meet your specific needs:

Provide consulting service for your existing file system filter driver.
Customize the SDK to meet your requirement.
Create your own filter driver with our source code.

The EaseTag Tiered Storage SDK is available for download from EaseFilter Inc.

What is a reparse point?

A reparse point is an object in a file system with attributes that activate extended functionality. A tag in the reparse point indicates the location from which external information should be taken and specifies an application associated with that information.

A file or directory can hold only one reparse point at a time. When a system opens a file and encounters a reparse point, the system finds the filter associated with the application indicated in the tag. The data in the reparse point can then be used to transparently execute whatever task is specified, through the application that created the reparse point.

The use of reparse points begins with applications. An application that wants to use the feature stores data specific to the application, which can be any sort of data at all, into a reparse point. The reparse point is tagged with an identifier specific to the application and stored with the file or directory. A special application-specific filter (a driver of sorts) is also associated with the reparse point tag type and made known to the file system. Different applications use different tags, although a given file or directory carries only one reparse point at a time. Microsoft itself reserves several tags for its own use.

Now, let's suppose that the user decides to access a file that has been tagged with a reparse point. When the file system goes to open the file, it notices the reparse point associated with the file. It then "reparses" the original request for the file, by finding the appropriate filter associated with the application that stored the reparse point, and passing the reparse point data to that filter. The filter can then use the data in the reparse point to do whatever is appropriate based on the reparse point functionality intended by the application. It is a very flexible system; how exactly the reparse point works is left up to the application. The really nice thing about reparse points is that they operate transparently to the user. You simply access the reparse point and the instructions are carried out automatically. This creates seamless extensions to file system functionality.
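
The lookup-by-tag behavior described above can be modeled in a few lines. This is a toy model only, not how NTFS is implemented: the `ToyFileSystem` class and its methods are hypothetical, and only the three tag constants are real values from the Windows SDK headers.

```python
# Tag values as defined in the Windows SDK headers.
IO_REPARSE_TAG_MOUNT_POINT = 0xA0000003
IO_REPARSE_TAG_HSM = 0xC0000004
IO_REPARSE_TAG_SYMLINK = 0xA000000C


class ToyFileSystem:
    """Toy model: each path has at most one (tag, data) reparse point."""

    def __init__(self):
        self.filters = {}   # tag -> handler registered for that tag
        self.reparse = {}   # path -> (tag, data)

    def register_filter(self, tag, handler):
        self.filters[tag] = handler

    def set_reparse_point(self, path, tag, data):
        self.reparse[path] = (tag, data)

    def open(self, path):
        point = self.reparse.get(path)
        if point is None:
            return f"opened {path}"          # ordinary file, no reparse point
        tag, data = point
        handler = self.filters.get(tag)
        if handler is None:
            raise LookupError(f"no filter registered for tag {tag:#010x}")
        # "Reparse": hand the stored data to the handler that owns the tag.
        return handler(path, data)


fs = ToyFileSystem()
fs.register_filter(IO_REPARSE_TAG_SYMLINK,
                   lambda path, target: f"opened {target}")
fs.set_reparse_point("C:\\link.txt", IO_REPARSE_TAG_SYMLINK, "C:\\real.txt")

print(fs.open("C:\\link.txt"))    # redirected through the handler
print(fs.open("C:\\plain.txt"))   # no reparse point, opened directly
```

The caller uses the same `open` call in both cases, which is the transparency the text describes: the redirection happens entirely inside the file system and the registered handler.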

In addition to allowing reparse points to implement many types of custom capabilities, Microsoft itself uses them to implement several features within Windows 2000 itself, including the following:

Symbolic Links: Symbolic linking allows you to create a pointer from one area of the directory structure to the actual location of the file elsewhere in the structure. NTFS does not implement "true" symbolic file linking as exists within UNIX file systems, but the functionality can be simulated by using reparse points. In essence, a symbolic link is a reparse point that redirects access from one file to another file.

Junction Points: A junction point is similar to a symbolic link, but instead of redirecting access from one file to another, it redirects access from one directory to another.

Volume Mount Points: A volume mount point is like a symbolic link or junction point, but taken to the next level: it is used to create dynamic access to entire disk volumes. For example, you can create volume mount points for removable hard disks or other storage media, or even use this feature to allow several different partitions (C:, D:, E: and so on) to appear to the user as if they were all in one logical volume. Windows 2000 can use this capability to break the traditional limit of 26 drive letters--using volume mount points, you can access volumes without the need for a drive letter for the volume. This is useful for large CD-ROM servers that would otherwise require a separate letter for each disk (and would also require the user to keep track of all these drive letters!)

Remote Storage Server (RSS): This feature of Windows 2000 uses a set of rules to determine when to move infrequently used files on an NTFS volume to archive storage (such as CD-RW or tape). When it moves a file to "offline" or "near offline" storage in this manner, RSS leaves behind reparse points that contain the instructions necessary to access the archived files, if they are needed in the future.

These are just a few examples of how reparse points can be used. As you can see, the functionality is very flexible. Reparse points are a nice addition to NTFS: they allow the capabilities of the file system to be enhanced without requiring any changes to the file system itself.

What is a sparse file?

A sparse file has an attribute that causes the I/O subsystem to allocate only meaningful (nonzero) data. Nonzero data is allocated on disk, and non-meaningful data (large strings of data composed of zeros) is not. When a sparse file is read, allocated data is returned as it was stored; non-allocated data is returned, by default, as zeros.

NTFS deallocates sparse data streams and only maintains other data as allocated. When a program accesses a sparse file, the file system yields allocated data as actual data and deallocated data as zeros.
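
The read behavior can be illustrated with a small in-memory model. This is a sketch of the concept, not of NTFS internals: the `SparseFile` class and its 4 KB block size are assumptions made for the example.

```python
class SparseFile:
    """Toy model of a sparse stream: only blocks holding data are stored."""

    def __init__(self, size: int, block_size: int = 4096):
        self.size = size
        self.block_size = block_size
        self.blocks = {}  # block index -> bytearray of length block_size

    def write(self, offset: int, data: bytes) -> None:
        if offset < 0 or offset + len(data) > self.size:
            raise ValueError("write outside the file")
        for i, byte in enumerate(data):
            index, pos = divmod(offset + i, self.block_size)
            block = self.blocks.get(index)
            if block is None:
                if byte == 0:
                    continue  # a zero written into a hole needs no allocation
                block = self.blocks[index] = bytearray(self.block_size)
            block[pos] = byte

    def read(self, offset: int, length: int) -> bytes:
        end = min(offset + length, self.size)
        out = bytearray()
        for position in range(offset, end):
            index, pos = divmod(position, self.block_size)
            block = self.blocks.get(index)
            # Deallocated ranges read back as zeros.
            out.append(block[pos] if block is not None else 0)
        return bytes(out)

    def allocated_bytes(self) -> int:
        return len(self.blocks) * self.block_size


f = SparseFile(size=1024 * 1024)      # 1 MB logical size
f.write(512 * 1024, b"payload")       # touches a single 4 KB block

print(f.size, f.allocated_bytes())
print(f.read(512 * 1024, 7), f.read(0, 4))
```

The logical size stays at 1 MB while only one 4 KB block is actually allocated, and reads from the untouched ranges return zeros.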

NTFS includes full sparse file support for both compressed and uncompressed files. NTFS handles read operations on sparse files by returning allocated data and sparse data. It is possible to read a sparse file as allocated data and a range of data without retrieving the entire data set, although NTFS returns the entire data set by default.

With the sparse file attribute set, the file system can deallocate data from anywhere in the file and, when an application calls, yield the zero data by range instead of storing and returning the actual data. File system application programming interfaces (APIs) allow the file to be copied or backed up as actual bits and sparse stream ranges. The net result is efficient file system storage and access. Figure 5-4 shows how data is stored with and without the sparse file attribute set.

Figure 5-4 Windows 2000 Data Storage



What is a stub file?

A stub file looks and acts like a regular file. It has the same file attributes as the original physical file (file size, creation time, last write time, last access time), and it keeps the original file's security settings. The difference between a stub file and a normal physical file is that the stub takes no physical space for its data and looks like a 0 KB file on disk.

Figure: stub file snapshot.

A typical stub file has these attributes: the offline attribute, which tells other applications such as antivirus and backup software that the file data is not local, so they can skip the file; the reparse point attribute, which lets you store up to 14 KB of your own data in the reparse point; and the sparse file attribute.
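
As a rough illustration, the three attributes can be tested as bit flags. The constant values below are the standard Windows file attribute flags; the helper functions `looks_like_stub` and `should_skip_scan` are hypothetical names for this sketch and not part of any SDK.

```python
# Standard Windows file attribute flag values.
FILE_ATTRIBUTE_ARCHIVE = 0x00000020
FILE_ATTRIBUTE_SPARSE_FILE = 0x00000200
FILE_ATTRIBUTE_REPARSE_POINT = 0x00000400
FILE_ATTRIBUTE_OFFLINE = 0x00001000

STUB_MASK = (FILE_ATTRIBUTE_OFFLINE
             | FILE_ATTRIBUTE_REPARSE_POINT
             | FILE_ATTRIBUTE_SPARSE_FILE)


def looks_like_stub(attributes: int) -> bool:
    """True when all three attributes named in the text are set."""
    return (attributes & STUB_MASK) == STUB_MASK


def should_skip_scan(attributes: int) -> bool:
    """Backup or antivirus tools can skip files whose data is not local."""
    return bool(attributes & FILE_ATTRIBUTE_OFFLINE)


stub_attrs = STUB_MASK | FILE_ATTRIBUTE_ARCHIVE
normal_attrs = FILE_ATTRIBUTE_ARCHIVE

print(looks_like_stub(stub_attrs), looks_like_stub(normal_attrs))
print(should_skip_scan(stub_attrs), should_skip_scan(normal_attrs))
```

On Windows, the attribute value for a real file could come from `os.stat(path).st_file_attributes`; the sketch uses literal values so that it runs on any platform.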

When you or a Windows® application accesses a migrated file stub, the Windows operating system transparently directs a file access request to the HSM for Windows client file system filter driver. This driver retrieves the full file from the repository to which it was migrated.

When a file is restored but not changed, that file is "re-stubbed" during the next migration process.

When a file is recalled, modified, and migrated again, that new version of the file is stored in the Storage Manager storage.

PROTECT THE SENSITIVE FILES WITH FILE SYSTEM FILTER DRIVER

INTRODUCTION
Unauthorised access to the information on your computer or portable storage devices can be carried out remotely, if the 'intruder' is able to read or modify your data over the Internet; or physically, if he manages to get hold of your hardware. You can protect yourself against either type of threat by improving the physical and network security of your data, as discussed in How to protect your computer from malware and hackers and How to protect your information from physical threats. It is always best to have several layers of defence, however, which is why you should also protect the files themselves. That way, your sensitive information is likely to remain safe even if your other security efforts prove inadequate.

There are two general approaches to the challenge of securing your data in this way. You can encrypt your files, making them unreadable to anyone but you, or you can hide them in the hope that an intruder will be unable to find your sensitive information.

ENCRYPTING YOUR INFORMATION
Storing confidential data can be a risk for you and for the people you work with. Encryption reduces this risk but does not eliminate it. The first step to protecting sensitive information is to reduce how much of it you keep around. Unless you have a good reason to store a particular file, or a particular category of information within a file, you should simply delete it (see How to destroy sensitive information for more information about how to do this securely).

Encrypting your information is a bit like keeping it in a locked safe. Only those who have a key or know the lock's combination (an encryption key or password, in this case) can access it. The analogy is particularly appropriate for TrueCrypt and tools like it, which create secure containers called 'encrypted volumes' rather than simply protecting one file at a time. You can put a large number of files into an encrypted volume, but these tools will not protect anything that is stored elsewhere on your computer or USB memory stick.

FILE SYSTEM FILTER DRIVER
A file system filter driver can intercept requests targeted at a file system or another file system filter driver. By intercepting the request before it reaches its intended target, the filter driver can extend or replace functionality provided by the original target of the request. With the filter driver you can monitor or control file system activities on the fly. You can intercept file open/create/replace, read/write, query/set file attribute/size/time/security information, rename/delete, directory browsing and file close requests.

EASEFILTER FILE SYSTEM CONTROL FILTER DRIVER
The EaseFilter file system control filter lets you control file activities: you can intercept a file system call, modify its content before or after the request goes down to the file system, and allow, deny or cancel its execution based on filter rules. You can fully control these I/O requests: file open/create/overwrite, read/write, query/set file attribute/size/time/security information, rename/delete, and directory browsing.

EASEFILTER FILE SYSTEM ENCRYPTION FILTER DRIVER
The EaseFilter file system encryption filter driver SDK provides a comprehensive solution for transparent file-level encryption. It lets developers create transparent encryption products that encrypt and decrypt files on the fly and allow only authorized users or processes to access the encrypted files.

The SDK supports the strong cryptographic algorithm Rijndael, a high-security algorithm chosen by the National Institute of Standards and Technology (NIST) as the Advanced Encryption Standard (AES). It supports key lengths of 128, 192 and 256 bits.


Virtual address spaces

When a processor reads or writes to a memory location, it uses a virtual address. As part of the read or write operation, the processor translates the virtual address to a physical address. Accessing memory through a virtual address has these advantages:

A program can use a contiguous range of virtual addresses to access a large memory buffer that is not contiguous in physical memory.

A program can use a range of virtual addresses to access a memory buffer that is larger than the available physical memory. As the supply of physical memory becomes small, the memory manager saves pages of physical memory (typically 4 kilobytes in size) to a disk file. Pages of data or code are moved between physical memory and the disk as needed.

The virtual addresses used by different processes are isolated from each other. The code in one process cannot alter the physical memory that is being used by another process.

The range of virtual addresses that is available to a process is called the virtual address space for the process. Each user-mode process has its own private virtual address space. For a 32-bit process, the virtual address space is usually the 2-gigabyte range 0x00000000 through 0x7FFFFFFF. For a 64-bit process, the virtual address space is the 8-terabyte range 0x000'00000000 through 0x7FF'FFFFFFFF. A range of virtual addresses is sometimes called a range of virtual memory.

This diagram illustrates some of the key features of virtual address spaces.

Virtual Address Space

The diagram shows the virtual address spaces for two 64-bit processes: Notepad.exe and MyApp.exe. Each process has its own virtual address space that goes from 0x000'00000000 through 0x7FF'FFFFFFFF. Each shaded block represents one page (4 kilobytes in size) of virtual or physical memory. Notice that the Notepad process uses three contiguous pages of virtual addresses, starting at 0x7F7'93950000. But those three contiguous pages of virtual addresses are mapped to noncontiguous pages in physical memory. Also notice that both processes use a page of virtual memory beginning at 0x7F7'93950000, but those virtual pages are mapped to different pages of physical memory.
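
The mapping in the diagram can be mimicked with a toy page table per process. This is an illustration only: real translation is done by the processor using multi-level page tables, and the physical frame numbers below are made up for the example.

```python
PAGE_SIZE = 4 * 1024  # 4 KB pages, as in the diagram


def split_address(virtual_address: int):
    """Split an address into (virtual page number, offset within the page)."""
    return divmod(virtual_address, PAGE_SIZE)


BASE = 0x7F7_9395_0000            # the address used in the diagram
FIRST_PAGE = BASE // PAGE_SIZE

# Hypothetical physical frame numbers; note that they are not contiguous.
notepad_table = {FIRST_PAGE: 0x1A2, FIRST_PAGE + 1: 0x07F, FIRST_PAGE + 2: 0x3C4}
myapp_table = {FIRST_PAGE: 0x2B0}


def translate(page_table: dict, virtual_address: int) -> int:
    """Translate a virtual address with one process's toy page table."""
    page, offset = split_address(virtual_address)
    return page_table[page] * PAGE_SIZE + offset


print(hex(translate(notepad_table, BASE)))
print(hex(translate(notepad_table, BASE + PAGE_SIZE + 4)))
print(hex(translate(myapp_table, BASE)))
```

The same virtual address resolves to a different physical address depending on which process's table is used, and Notepad's three adjacent virtual pages land on scattered physical frames.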

User space and system space

Processes like Notepad.exe and MyApp.exe run in user mode. Core operating system components and many drivers run in the more privileged kernel mode. For more information about processor modes, see User mode and kernel mode. Each user-mode process has its own private virtual address space, but all code that runs in kernel mode shares a single virtual address space called system space. The virtual address space for the current user-mode process is called user space.

In 32-bit Windows, the total available virtual address space is 2^32 bytes (4 gigabytes). Usually the lower 2 gigabytes are used for user space, and the upper 2 gigabytes are used for system space.



In 32-bit Windows, you have the option of specifying (at boot time) that more than 2 gigabytes are available for user space. The consequence is that fewer virtual addresses are available for system space. You can increase the size of user space to as much as 3 gigabytes, in which case only 1 gigabyte is available for system space. To increase the size of user space, use BCDEdit /set increaseuserva.

In 64-bit Windows, the theoretical amount of virtual address space is 2^64 bytes (16 exabytes), but only a small portion of the 16-exabyte range is actually used. The 8-terabyte range from 0x000'00000000 through 0x7FF'FFFFFFFF is used for user space, and portions of the 248-terabyte range from 0xFFFF0800'00000000 through 0xFFFFFFFF'FFFFFFFF are used for system space.
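
The sizes quoted for these ranges follow directly from their end points, which a short calculation can confirm. The helper name `range_size` is just for this sketch.

```python
GB = 1 << 30
TB = 1 << 40


def range_size(first: int, last: int) -> int:
    """Number of bytes in the inclusive address range [first, last]."""
    return last - first + 1


user_32 = range_size(0x00000000, 0x7FFFFFFF)
user_64 = range_size(0x000_00000000, 0x7FF_FFFFFFFF)
system_64 = range_size(0xFFFF0800_00000000, 0xFFFFFFFF_FFFFFFFF)

print(user_32 // GB, "GB of 32-bit user space")
print(user_64 // TB, "TB of 64-bit user space")
print(system_64 // TB, "TB range containing 64-bit system space")
```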



Code running in user mode has access to user space but does not have access to system space. This restriction prevents user-mode code from reading or altering protected operating system data structures. Code running in kernel mode has access to both user space and system space. That is, code running in kernel mode has access to system space and the virtual address space of the current user-mode process.

Drivers that run in kernel mode must be very careful about directly reading from or writing to addresses in user space. This scenario illustrates why.

1. A user-mode program initiates a request to read some data from a device. The program supplies the starting address of a buffer to receive the data.
2. A device driver routine, running in kernel mode, starts the read operation and returns control to its caller.
3. Later the device interrupts whatever thread is currently running to say that the read operation is complete. The interrupt is handled by kernel-mode driver routines running on this arbitrary thread, which belongs to an arbitrary process.
4. At this point, the driver must not write the data to the starting address that the user-mode program supplied in Step 1. This address is in the virtual address space of the process that initiated the request, which is most likely not the same as the current process.
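
The scenario can be played out with a toy model in which each process has its own private address space. Everything here is hypothetical (`Process`, `ToyKernel` and its methods); the "safe" path simply stands in for whatever mechanism a real driver uses to reach the requester's buffer, such as a system buffer or locked and mapped pages.

```python
class Process:
    """A process with a private address space (address -> value)."""

    def __init__(self, name):
        self.name = name
        self.memory = {}


class ToyKernel:
    def __init__(self):
        self.current = None   # process whose address space is currently active
        self.pending = None   # (requesting process, user buffer address)

    def start_read(self, requester, user_buffer):
        # Steps 1 and 2: record the request and return to the caller.
        self.pending = (requester, user_buffer)

    def complete_unsafe(self, data):
        # Step 4 done wrong: use the raw user address in the current context.
        _, user_buffer = self.pending
        self.current.memory[user_buffer] = data

    def complete_safe(self, data):
        # Deliver the data to the process that actually made the request.
        requester, user_buffer = self.pending
        requester.memory[user_buffer] = data


my_app = Process("MyApp.exe")
notepad = Process("Notepad.exe")
kernel = ToyKernel()

kernel.start_read(my_app, user_buffer=0x1000)
kernel.current = notepad              # Step 3: interrupt arrives in another process
kernel.complete_unsafe(b"device data")
print("unsafe:", my_app.memory, notepad.memory)

notepad.memory.clear()
kernel.complete_safe(b"device data")
print("safe:  ", my_app.memory, notepad.memory)
```

In the unsafe case the data lands in Notepad's address space, corrupting an unrelated process while the requester never sees it.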

Paged pool and Nonpaged pool

In user space, all physical memory pages can be paged out to a disk file as needed. In system space, some physical pages can be paged out and others cannot. System space has two regions for dynamically allocating memory: paged pool and nonpaged pool. In 64-bit Windows, paged pool is the 128-gigabyte range of virtual addresses that goes from 0xFFFFA800'00000000 through 0xFFFFA81F'FFFFFFFF. Nonpaged pool is the 128-gigabyte range of virtual addresses that goes from 0xFFFFAC00'00000000 through 0xFFFFAC1F'FFFFFFFF.

Memory that is allocated in paged pool can be paged out to a disk file as needed. Memory that is allocated in nonpaged pool can never be paged out to a disk file.



File System Filter Drivers

File system filter drivers are often the topic of some interesting discussions when working on server performance issues. Understanding how a file system filter driver works is the topic of today’s post. We’ll also quickly discuss one of the most common issues that we see - especially when dealing with Anti-Virus filter drivers and updates.

Simply put, a file system filter driver is a driver that sits on top of the file system and examines requests made to the file system to determine how (and in some cases, IF) the request should be handled. Different applications, such as remote file replication services and file encryption, use filter drivers, but the one with which we are all familiar is the Anti-Virus filter driver.

Let’s look at an example of how this works when real-time scanning is enabled. When an application tries to open a file, the filter driver intercepts the request and examines the file being opened to ensure that it does not have a virus. If the file is clean, then the request is sent on to the file system. However, if the file is infected, then the virus scanner notifies its associated Windows service process to quarantine or clean the file. If the file cannot be cleaned, then the filter driver fails the request (usually with an Access Denied error) so that the virus cannot become active.
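
The decision flow in this example can be sketched as follows. It is a simplified model, not a real filter driver: the `on_open` function, the `Verdict` type, the byte-string "signatures" and the cleaning callbacks are all hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    DENY = "access denied"


def on_open(contents: bytes, signatures, clean):
    """Toy on-access check: scan, try to clean, otherwise fail the open."""
    if not any(sig in contents for sig in signatures):
        return Verdict.ALLOW, contents        # clean file: pass the request on
    cleaned = clean(contents)                 # stands in for the service process
    if cleaned is not None and not any(sig in cleaned for sig in signatures):
        return Verdict.ALLOW, cleaned
    return Verdict.DENY, None                 # could not be cleaned


SIGNATURES = [b"EVIL"]                        # made-up signature


def strip_signature(data: bytes):
    return data.replace(b"EVIL", b"")


def cannot_clean(data: bytes):
    return None


print(on_open(b"hello", SIGNATURES, strip_signature))
print(on_open(b"heEVILllo", SIGNATURES, strip_signature))
print(on_open(b"EVIL", SIGNATURES, cannot_clean))
```

Every open pays for the scan before the request reaches the file system, which is why a slow or faulty filter shows up to users as slow or inaccessible files.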

Now, you’re probably asking yourself, “That’s great, but what does this have to do with server performance?” If a file system filter driver is not functioning properly, requests may get stuck, time out or fail – and not because the file being accessed is infected with a virus. From the user’s perspective, access to their files (usually across the LAN / WAN) appears to be incredibly slow, or the files may appear to be inaccessible. For those of you that have worked with our Support Engineers on issues like this, one of our common lines of questioning concerns how Anti-Virus, specifically On-Access or Real-Time scanning, is configured. Which brings us to the second part of our post … the most common “gotcha” that we see with respect to the Anti-Virus filter driver and updates …

When updating Anti-Virus, the primary concern is ensuring that the Anti-Virus signature file is current to guard against emerging threats and existing viruses. However, although keeping your signature file current is obviously important, it is equally important to ensure that your Anti-Virus file system filter driver is kept up to date as well. We have had more than a few issues where a customer has reported Pool memory depletion or a server soft hang, and after investigating, the culprit turned out to be an outdated file system filter driver for the Anti-Virus software. As part of your maintenance routine, when keeping an eye out for updated drivers and firmware for your servers, you should also keep an eye out to make sure that you are running the latest file system filter drivers for your Anti-Virus as well.


What Is the Kernel Filter Manager?

The kernel of a computer operating system is its core, the heart that controls everything around it. Microsoft based early versions of the Windows OS on the DOS operating system, but switched to a kernel-based system for Windows NT and 2000. The NT kernel has been the basis for subsequent OS versions. The kernel filter manager enables Windows' two modes, kernel mode and user mode, to communicate.

When a central processing unit operates in kernel mode, whatever code the CPU runs has direct access to the system's underlying hardware and memory. In user mode, code can only gain access to the inner workings by going through an application programming interface. The CPU hardware keeps the two modes distinct. When they need to share information, the filter manager connects them through communication ports, allowing for a fast exchange of data between them.

Filter Manager

The filter manager sits in the file system I/O stack, where minifilter drivers register with it to filter I/O requests. The manager attaches each minifilter at a particular position in the stack. When user mode and kernel mode need to communicate, a minifilter opens a port, specifies a security descriptor and listens for connection attempts. If the user-mode caller has sufficient access, the filter manager allows the connection. When communication ends, the connection is closed.

Kernel Stack

The kernel-mode stack has a limited amount of memory. The amount is determined by the operating system and can't be modified. Because the stack is limited, Microsoft recommends that driver writers conserve as much stack space as possible. The filter manager helps: Microsoft has optimized it to use little stack space, and recursive I/O issued through the filter manager doesn't place as heavy a demand on the stack as when issued by other methods.

Complexity

Another advantage to employing a filter manager is that it comes with support routines that help with common functions, such as kernel and user mode communications. This simplifies filtering requests. Minifilter drivers themselves run in kernel mode, but the communication ports let them hand work to a user-mode service, which is safer: if a driver crashes in kernel mode, the result can take down the entire system, whereas a crash in user mode affects only the offending process.


IRP Function Codes

This section summarizes the aspects of handling and operation of I/O request packet (IRP) function codes that are specific to file system drivers and file system filter drivers.
For detailed information about how to write dispatch routines that perform the tasks that are described in the "Operation: File System Drivers" and "Operation: File System Filter Drivers" subsections of each topic, see Writing IRP Dispatch Routines.
For more general information about these requests, see IRP Major Function Codes.

In this section

IRP_MJ_CLEANUP
IRP_MJ_CLOSE
IRP_MJ_CREATE
IRP_MJ_DEVICE_CONTROL
IRP_MJ_DIRECTORY_CONTROL
IRP_MJ_FILE_SYSTEM_CONTROL
IRP_MJ_FLUSH_BUFFERS
IRP_MJ_INTERNAL_DEVICE_CONTROL
IRP_MJ_LOCK_CONTROL
IRP_MJ_PNP
IRP_MJ_QUERY_EA
IRP_MJ_QUERY_INFORMATION
IRP_MJ_QUERY_QUOTA
IRP_MJ_QUERY_SECURITY
IRP_MJ_QUERY_VOLUME_INFORMATION
IRP_MJ_READ
IRP_MJ_SET_EA
IRP_MJ_SET_INFORMATION
IRP_MJ_SET_QUOTA
IRP_MJ_SET_SECURITY
IRP_MJ_SET_VOLUME_INFORMATION
IRP_MJ_SHUTDOWN
IRP_MJ_WRITE
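
A driver typically handles these codes through a table of dispatch routines indexed by the major function code. The sketch below imitates that pattern in Python. The numeric values are the ones defined in the Windows Driver Kit headers, while the dictionary-based `irp` and the handler functions are simplifications invented for this example.

```python
# Major function code values as defined in the Windows Driver Kit headers.
IRP_MJ_CREATE = 0x00
IRP_MJ_CLOSE = 0x02
IRP_MJ_READ = 0x03
IRP_MJ_WRITE = 0x04
IRP_MJ_CLEANUP = 0x12
IRP_MJ_MAXIMUM_FUNCTION = 0x1B


def pass_through(irp):
    """Default routine: a filter that is not interested forwards the request."""
    return "passed to next driver"


def on_create(irp):
    return f"create seen for {irp['file']}"


def on_read(irp):
    return f"read {irp['length']} bytes at offset {irp['offset']}"


# One slot per major function code, filled with the default first.
dispatch = [pass_through] * (IRP_MJ_MAXIMUM_FUNCTION + 1)
dispatch[IRP_MJ_CREATE] = on_create
dispatch[IRP_MJ_READ] = on_read


def handle(irp):
    """Route a request to the routine registered for its major code."""
    return dispatch[irp["major"]](irp)


print(handle({"major": IRP_MJ_CREATE, "file": "report.doc"}))
print(handle({"major": IRP_MJ_READ, "offset": 0, "length": 4096}))
print(handle({"major": IRP_MJ_CLEANUP}))
```

Filling every slot with a pass-through default means that requests the filter does not care about still reach the driver below it.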


File System Control Codes

This section describes file system control (FSCTL) codes that can be processed by file systems, file system filter and minifilter drivers, and file system network redirectors. The following FSCTL codes are currently documented for kernel-mode drivers.
In this section

FSCTL_DELETE_REPARSE_POINT
FSCTL_DISMOUNT_VOLUME
FSCTL_FILE_LEVEL_TRIM
FSCTL_FIND_FILES_BY_SID
FSCTL_GET_BOOT_AREA_INFO
FSCTL_GET_REPARSE_POINT
FSCTL_GET_RETRIEVAL_POINTER_BASE
FSCTL_GET_RETRIEVAL_POINTERS
FSCTL_INVALIDATE_VOLUMES
FSCTL_IS_PATHNAME_VALID
FSCTL_IS_VOLUME_DIRTY
FSCTL_LMR_GET_LINK_TRACKING_INFORMATION
FSCTL_MARK_AS_SYSTEM_HIVE
FSCTL_MARK_VOLUME_DIRTY
FSCTL_OFFLOAD_READ
FSCTL_OFFLOAD_WRITE
FSCTL_OPBATCH_ACK_CLOSE_PENDING
FSCTL_OPLOCK_BREAK_ACK_NO_2
FSCTL_OPLOCK_BREAK_ACKNOWLEDGE
FSCTL_OPLOCK_BREAK_NOTIFY
FSCTL_QUERY_FILE_LAYOUT
FSCTL_QUERY_PERSISTENT_VOLUME_STATE
FSCTL_QUERY_RETRIEVAL_POINTERS
FSCTL_REQUEST_BATCH_OPLOCK
FSCTL_REQUEST_FILTER_OPLOCK
FSCTL_REQUEST_OPLOCK
FSCTL_REQUEST_OPLOCK_LEVEL_1
FSCTL_REQUEST_OPLOCK_LEVEL_2
FSCTL_SET_PERSISTENT_VOLUME_STATE
FSCTL_SET_REPARSE_POINT
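
Each FSCTL code is a 32-bit value packed from a device type, an access mask, a function number and a transfer method, following the layout of the `CTL_CODE` macro in the Windows headers. The sketch below reproduces that packing for the three reparse point codes; the Python function names are illustrative only.

```python
# Constants as defined in the Windows SDK headers.
FILE_DEVICE_FILE_SYSTEM = 0x00000009
METHOD_BUFFERED = 0
FILE_ANY_ACCESS = 0
FILE_SPECIAL_ACCESS = FILE_ANY_ACCESS


def ctl_code(device_type: int, function: int, method: int, access: int) -> int:
    """Pack the four fields the same way the CTL_CODE macro does."""
    return (device_type << 16) | (access << 14) | (function << 2) | method


def decode(code: int):
    """Unpack a control code into (device type, access, function, method)."""
    return (code >> 16, (code >> 14) & 0x3, (code >> 2) & 0xFFF, code & 0x3)


FSCTL_SET_REPARSE_POINT = ctl_code(
    FILE_DEVICE_FILE_SYSTEM, 41, METHOD_BUFFERED, FILE_SPECIAL_ACCESS)
FSCTL_GET_REPARSE_POINT = ctl_code(
    FILE_DEVICE_FILE_SYSTEM, 42, METHOD_BUFFERED, FILE_ANY_ACCESS)
FSCTL_DELETE_REPARSE_POINT = ctl_code(
    FILE_DEVICE_FILE_SYSTEM, 43, METHOD_BUFFERED, FILE_SPECIAL_ACCESS)

for name, code in (("FSCTL_SET_REPARSE_POINT", FSCTL_SET_REPARSE_POINT),
                   ("FSCTL_GET_REPARSE_POINT", FSCTL_GET_REPARSE_POINT),
                   ("FSCTL_DELETE_REPARSE_POINT", FSCTL_DELETE_REPARSE_POINT)):
    print(f"{name} = {code:#010x}")
```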