Python distributed file system

Building a Distributed File System with Basic components like Directory Service, Locking Service and Caching Service.

AshwathSalimath/Distributed-File-System-Python

CS7NS1 Individual Project — Distributed File System, Student ID: 17306521

To deliver a distributed file system implementation.

Note: This project is still in progress. Documentation on how the project works will follow.

List of Developed Modules

The file service is the core of any distributed file system. It consists of a TCP server that provides access to files on the machine on which it runs, and a client-side file service proxy that provides a language-specific interface to the file system.

The directory service is responsible for mapping human-readable, global file names to the file identifiers used by the file system itself. A user request to open a particular file X is passed by the client proxy to the directory server for resolution.
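
As a minimal sketch of the name resolution described above, the directory service can be modelled as a lookup table from global file names to a file server address plus an internal identifier. The names `DirectoryService` and `FileLocation`, and the example host, port and identifier, are hypothetical and not taken from the project.

```python
from typing import Dict, NamedTuple, Optional


class FileLocation(NamedTuple):
    """Where a file lives: the file server address and its internal identifier."""
    host: str
    port: int
    file_id: str


class DirectoryService:
    """Hypothetical in-memory directory: global file name -> FileLocation."""

    def __init__(self) -> None:
        self._table: Dict[str, FileLocation] = {}

    def register(self, global_name: str, location: FileLocation) -> None:
        self._table[global_name] = location

    def resolve(self, global_name: str) -> Optional[FileLocation]:
        # Returns None for an unknown name so the client proxy can report it.
        return self._table.get(global_name)


directory = DirectoryService()
directory.register("/docs/report.txt", FileLocation("127.0.0.1", 8001, "f-0001"))
location = directory.resolve("/docs/report.txt")
```

A real directory server would persist this table and answer over the network; the sketch only shows the mapping step.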

Caching is a vital element of any file system design that is required to give good performance and scale.

The lock server simply holds a semaphore for each file it is told about. Any client wishing to access a file asks the lock server for access.
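
The idea of one semaphore per file can be sketched in a few lines with the standard `threading` module. This is an in-process illustration only: the class name `LockServer`, its method names and the timeout values are assumptions, and the project's real lock server would receive these requests over the network.

```python
import threading
from collections import defaultdict
from typing import DefaultDict


class LockServer:
    """Hypothetical lock server: one binary semaphore per file name."""

    def __init__(self) -> None:
        self._guard = threading.Lock()  # protects the dictionary itself
        self._semaphores: DefaultDict[str, threading.Semaphore] = defaultdict(
            lambda: threading.Semaphore(1)
        )

    def _semaphore_for(self, file_name: str) -> threading.Semaphore:
        with self._guard:
            return self._semaphores[file_name]

    def acquire(self, file_name: str, timeout: float = 1.0) -> bool:
        # Returns False instead of blocking forever if the file stays locked.
        return self._semaphore_for(file_name).acquire(timeout=timeout)

    def release(self, file_name: str) -> None:
        self._semaphore_for(file_name).release()


locks = LockServer()
first = locks.acquire("notes.txt")                 # file is free
second = locks.acquire("notes.txt", timeout=0.1)   # already held, times out
locks.release("notes.txt")
third = locks.acquire("notes.txt")                 # free again
```

Using a timeout rather than a blocking acquire is a design choice for the sketch, so that a client that never releases cannot stall every other caller indefinitely.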

A distributed file system implemented in Python

lavelle96/distributed-file-system

A RESTful distributed NFS written in Python, with features such as transparent file access, locking, caching, a directory service, file replication and authentication.

Running the system

  • The system uses Linux commands to open files: when a file is read, it is opened in the operating system's default program for that file type.
  • MongoDB is needed to run the system.
  • Each server has a bash script in the src folder for running it.
  • The default ports the servers start on are in the config files. Every port can be changed except the registry server port, which has to be known by all servers.
  • The database must be set up first.
  • Next, start the registry server.
  • Then start every server except the file servers.
  • The file servers can then be started.
  • At this stage the system is up and running, and the client program can be started by calling python Client.py

Basic Structure

The database used is MongoDB, a document-oriented database. It is very flexible and stores objects as JSON-like documents, which makes it easy to work with, especially through APIs, since JSON is such a convenient format for passing information around.

  • A client using the file system has six options on startup: it can read, write, create and delete files, list the files stored on the file system with the ‘show’ command, and look up the syntax of the commands with the ‘help’ command.
  • If the client has admin clearance, it has two more options: adding another user to the database and listing the currently registered users.
  • The client does not need to know what goes on behind these commands. All it needs is the address of the Registry Server, the server that stores the whereabouts of every other server in the system. This address is constant throughout the lifetime of the system.
  • For example, to read a file, the syntax is as simple as read, and everything else is handled under the hood.
{ 'dir_name': ..., 'dir_port': ..., 'dir_load': ... (used to decide which file server will handle the request) }
  • When a server comes online, it sends a POST request to the registry server, which adds that server to the database.
  • The registry server has a thread that routinely checks the state of the servers. If a server goes down, the registry server finds out and removes it from the database. To support this, each server has an endpoint that reports its current state (whether or not it is still alive). The same endpoint also allows a server to be shut down with a DELETE request.
  • The registry server also provides load balancing for all servers except the file servers, whose balancing is handled by the directory server. For example, if a caller asks for the address of a directory server and two directory servers are online, the address of the one that has handled the fewest requests is returned.
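
The registry behaviour described above (registration, removal of dead servers, and least-used selection) can be sketched in memory as follows. The class and method names, the `kind` labels and the port numbers are hypothetical. The sketch also assumes that the registry counts a request each time it hands out an address, which the source does not confirm.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class ServerEntry:
    kind: str                  # e.g. "directory" (label assumed for illustration)
    port: int
    requests_handled: int = 0


class Registry:
    def __init__(self) -> None:
        self._servers: Dict[int, ServerEntry] = {}

    def register(self, kind: str, port: int) -> None:
        """A server announces itself when it comes online."""
        self._servers[port] = ServerEntry(kind, port)

    def remove(self, port: int) -> None:
        """Forget a server that has gone down or shut down."""
        self._servers.pop(port, None)

    def prune(self, is_alive: Callable[[int], bool]) -> List[int]:
        """Health check pass: drop every server for which is_alive(port) is False."""
        dead = [port for port in self._servers if not is_alive(port)]
        for port in dead:
            self.remove(port)
        return dead

    def lookup(self, kind: str) -> Optional[int]:
        """Return the port of the least-used server of the given kind."""
        candidates = [s for s in self._servers.values() if s.kind == kind]
        if not candidates:
            return None
        chosen = min(candidates, key=lambda s: s.requests_handled)
        chosen.requests_handled += 1  # assumption: a lookup counts as one request
        return chosen.port


registry = Registry()
registry.register("directory", 5001)
registry.register("directory", 5002)
picks = [registry.lookup("directory") for _ in range(4)]
dead = registry.prune(lambda port: port != 5002)  # pretend 5002 stopped answering
```

In the real system `prune` would run on the background thread and `is_alive` would be a request to each server's state endpoint.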
  • The file servers are completely stateless.
  • Endpoint for file manipulation:
    • Get: reads a file.
    • Post: writes or creates a file. The file server also sends this request on to the directory server so that the change is replicated across all file servers that support the file.
    • Delete: deletes a file, with the same replication behaviour as Post.
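
A rough sketch of the three file server operations, leaving out the HTTP layer. The `FileServer` class, the `notify_directory` callback (standing in for the request sent on to the directory server) and the `replicate` flag are all assumptions made for illustration. The flag is one plausible way to stop a replicated write from being forwarded again, not something the source describes.

```python
import tempfile
from pathlib import Path
from typing import Callable, List, Optional, Tuple


class FileServer:
    """Hypothetical stateless file server: everything it knows is on disk."""

    def __init__(self, root: Path, notify_directory: Callable[[str, str], None]) -> None:
        self._root = root
        self._notify = notify_directory

    def get(self, name: str) -> Optional[str]:
        path = self._root / name
        return path.read_text() if path.exists() else None

    def post(self, name: str, content: str, replicate: bool = True) -> None:
        (self._root / name).write_text(content)
        if replicate:
            self._notify("post", name)    # directory server fans the write out

    def delete(self, name: str, replicate: bool = True) -> None:
        (self._root / name).unlink(missing_ok=True)  # needs Python 3.8+
        if replicate:
            self._notify("delete", name)


events: List[Tuple[str, str]] = []
with tempfile.TemporaryDirectory() as tmp:
    server = FileServer(Path(tmp), lambda action, name: events.append((action, name)))
    server.post("a.txt", "hello")
    read_back = server.get("a.txt")
    server.delete("a.txt")
    after_delete = server.get("a.txt")
```

Because the server keeps no state of its own between calls, any replica holding the same files can answer the same request.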
  • The directory server provides a lot of functionality for managing the file servers.
  • It has two JSON structures stored in the database:
    • Active nodes: keeps track of the addresses of the file servers that are online, which files they support and the load on each file server. Structure:
    • File map: keeps track of the files that exist in the file system, the number of file servers each is supported by and the addresses of those file servers. The version of the file is also stored. Structure:
      { 'file_name': ..., 'num_nodes': ..., 'ports': ..., 'file_version': ... }
  • Endpoints:
    • Get: gets the port of a specified file. If more than one server supports the file, the address of the server with the lowest load is returned.
    • Post: creates or updates a file in the database. A file server sends this request after it updates or creates a file, both to keep the database consistent and to allow the change to be replicated on the other file servers that support the file. When the directory server receives this request, it sends a write request on to all file servers that support the file.
    • Delete: deletes a file from the database and from every node that supports it (used in a similar fashion to the Post request).
    • Post: file servers post the files that they support to this endpoint when they come online. This information is then stored in the database.
    • Delete: deletes a file server from the directory server's database. It is usually sent by the file server before it goes down, but is also sent when the registry server notices that a file server is down.
    • Get: returns all files in the database.
    • Get: returns the version of the specified file.
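
The two core directory server decisions, which replica serves a read and which replicas need a write forwarded, can be sketched as below. The field names follow the file map structure in the text; the values, the `node_load` shape, the function names, bumping the version on every write and skipping the originating server when forwarding are all assumptions.

```python
from typing import Dict, List, Optional

# Field names follow the file map structure in the text; the values are made up.
file_map: Dict[str, dict] = {
    "a.txt": {"file_name": "a.txt", "num_nodes": 2, "ports": [6001, 6002], "file_version": 3},
}
# Assumed shape for the active-node records: port -> current load.
node_load: Dict[int, int] = {6001: 7, 6002: 2}


def port_for_read(name: str) -> Optional[int]:
    """Get: return the least-loaded file server that supports the file."""
    entry = file_map.get(name)
    if entry is None:
        return None
    return min(entry["ports"], key=lambda port: node_load.get(port, 0))


def record_write(name: str, origin_port: int) -> List[int]:
    """Post: create or update the entry, bump the version, and return the
    ports of the other replicas that still need the write forwarded to them."""
    entry = file_map.setdefault(
        name, {"file_name": name, "num_nodes": 0, "ports": [], "file_version": 0}
    )
    if origin_port not in entry["ports"]:
        entry["ports"].append(origin_port)
        entry["num_nodes"] = len(entry["ports"])
    entry["file_version"] += 1
    return [port for port in entry["ports"] if port != origin_port]


read_port = port_for_read("a.txt")
forward_to = record_write("a.txt", origin_port=6001)
```

Here the read goes to port 6002 because its assumed load is lower, and the write that originated on 6001 only needs forwarding to 6002.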
  • The lock server has one JSON structure stored in the database. It stores the names of files that are or have previously been locked, whether each file is currently locked, and the queue of addresses of the clients waiting to lock it. Here is the basic structure:
    { 'file_name': ..., 'is_locked': ..., 'lock_holders': [queue of client ids] }
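
One way to read the lock record above is as a first-in, first-out queue in which the client at the head holds the lock. The sketch below follows that reading. The interpretation itself, the function names and the client ids are assumptions, and the real system stores these records in MongoDB rather than in a dictionary.

```python
from collections import deque
from typing import Deque, Dict, Optional


class FileLock:
    """Mirrors the record in the text: file_name, is_locked, lock_holders."""

    def __init__(self, file_name: str) -> None:
        self.file_name = file_name
        self.is_locked = False
        self.lock_holders: Deque[str] = deque()  # head of the queue holds the lock


locks: Dict[str, FileLock] = {}


def request_lock(file_name: str, client_id: str) -> bool:
    """Queue the client; return True if it now holds the lock."""
    record = locks.setdefault(file_name, FileLock(file_name))
    if client_id not in record.lock_holders:
        record.lock_holders.append(client_id)
    record.is_locked = True
    return record.lock_holders[0] == client_id


def release_lock(file_name: str, client_id: str) -> Optional[str]:
    """Release the lock held by client_id and return the next holder.
    Returns None if nobody is waiting or if client_id is not the holder."""
    record = locks.get(file_name)
    if record is None or not record.lock_holders or record.lock_holders[0] != client_id:
        return None
    record.lock_holders.popleft()
    record.is_locked = bool(record.lock_holders)
    return record.lock_holders[0] if record.lock_holders else None


got_a = request_lock("a.txt", "client-a")        # first in the queue, holds the lock
got_b = request_lock("a.txt", "client-b")        # queued behind client-a
next_holder = release_lock("a.txt", "client-a")  # lock passes to client-b
```

Keeping the record after the queue empties matches the description that the structure also holds files that were locked before.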