Monthly Archives: September 2017

6 Reasons That Prove Smart Service Is Required For Cloud Transformation

It is important to understand the implications of the choices you make in order to benefit from cloud technology. A smart service provider helps produce better outcomes and avoid bad decisions. Deciding to move to the cloud might be easy, but the actual execution can lead to serious headaches. Smart Service is the medicine you need to cure that migraine before it starts.

Consider The Following Points Before You Execute Your Cloud Transformation:

1. Handing over your data and applications to a public cloud provider might seem very attractive, but you need to look at the consequences that can occur. In return for convenience, you hand over control to the public cloud provider. With your own private cloud, you get the benefits of the cloud while maintaining full control and ownership, but this requires a well-trained IT staff and a significant investment. Which solution is best for you depends on your requirements and restrictions.

2. Planning and building a private cloud consumes a lot of time. It begins with a review of the present data center, especially to reveal which equipment can actually be re-used for the cloud. If you choose to construct entirely new cloud data centers, there is no need to touch the legacy environment.

3. After determining the desired end state, you need to shift the data and applications to the new cloud environment, and this task must be accomplished with zero data loss and minimal impact on productivity. It requires a lot of preparation and can be quite complex. The data transfer needs to be aligned with the migration of applications to avoid synchronization problems.

4. One benefit of a cloud solution is that resources can be added or removed dynamically. There are several ways to do this, known as advanced, dynamic, or user provisioning. With dynamic provisioning, resources are provided in real time and fully automatically. With advanced provisioning, users get the available resources in advance. The different solutions have different consequences and different price tags, so a good understanding of the pros and cons is required to make the right choice.

5. A proper cloud transition takes time, but a transition that takes years to complete means high cost, frustration, and a risk of project abandonment. The faster the transition, the sooner the benefits are achieved and the sooner the disruptions end.

6. Managing a cloud transition project by engaging a specialist firm requires a financial investment, so doing it with internal staff can seem very attractive. Instead of hiring an external provider, why not use the resources that you are already paying for? The reason is that your internal staff has probably never done a cloud migration before.

Thus, our DBA course is more than enough for you to build your career in this field.

Stay connected to CRB Tech for more technical optimization and other updates and information.


Apache Spark Interview Questions and Answers

There are lots of candidates looking out for DBA jobs, and this blog will help them by providing some Apache Spark interview questions to prepare for, so they can perform well in the interview and then make a career in this field.

1) What is Apache Spark?

Spark is a fast, flexible, and easy-to-use data processing framework. It has an advanced execution engine that supports cyclic data flow and in-memory computing. Spark can run on Hadoop, standalone, or in the cloud, and it is capable of accessing diverse data sources including HDFS, HBase, Cassandra, and others.

2) Define RDD

RDD stands for Resilient Distributed Datasets: a fault-tolerant collection of elements that can be operated on in parallel. The data in an RDD is immutable, partitioned, and distributed. There are primarily two types of RDD:

Parallelized Collections: RDDs created by parallelizing an existing collection in the driver program so its elements can be operated on in parallel.

Hadoop datasets: RDDs created from files in HDFS or another storage system, where a function is applied to each file record.
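To make the two RDD types concrete, here is a minimal PySpark sketch; the SparkContext settings and the HDFS path are placeholders used only for illustration.

```python
from pyspark import SparkContext

sc = SparkContext("local[*]", "rdd-types-demo")

# Parallelized collection: an RDD built from an in-memory Python list
nums = sc.parallelize([1, 2, 3, 4, 5])

# Hadoop dataset: an RDD built from a file in HDFS (or another storage
# system); the path below is only a placeholder
lines = sc.textFile("hdfs:///data/logs.txt")

print(nums.count())   # runs in parallel across the available cores/nodes
sc.stop()
```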

3) Discuss the working of the Spark Engine

The Spark Engine is responsible for scheduling, distributing, and monitoring the data application across the cluster.

4) Explain Partitions

Partitioning is the splitting, or logical division, of data, similar to splits in MapReduce, but with smaller units. The partitioning process derives logical units of data in order to speed up processing. Every RDD in Spark is partitioned.
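As a hedged illustration of partitioning, the PySpark sketch below requests an explicit number of partitions and then inspects them; the numbers chosen are arbitrary.

```python
from pyspark import SparkContext

sc = SparkContext("local[4]", "partition-demo")

# Ask Spark to split the collection into 8 partitions explicitly
rdd = sc.parallelize(range(1000), 8)

print(rdd.getNumPartitions())                         # -> 8
# glom() turns each partition into a list, making the split visible
print([len(part) for part in rdd.glom().collect()])   # sizes of the 8 partitions
sc.stop()
```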

5) What operations does an RDD support?

Actions

Transformations

6) Explain transformations in Spark

Transformations are functions applied to an RDD that result in another RDD. A transformation is not executed until an action happens. map() and filter() are examples of transformations: map() applies the function passed to it to each element of the RDD and results in another RDD, while filter() creates a new RDD by selecting the elements of the current RDD that pass the function argument.
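Here is a minimal PySpark sketch of map() and filter(); the context name and sample data are placeholders.

```python
from pyspark import SparkContext

sc = SparkContext("local[*]", "transformations-demo")
rdd = sc.parallelize([1, 2, 3, 4, 5])

# map() applies the passed function to every element and yields a new RDD
squares = rdd.map(lambda x: x * x)

# filter() keeps only the elements that satisfy the predicate
evens = squares.filter(lambda x: x % 2 == 0)

# Nothing has executed yet; transformations are lazy until an action runs
print(evens.collect())   # -> [4, 16]
sc.stop()
```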

7) Explain Actions

Actions bring the data back from an RDD to the local machine. An action's execution is the result of all the previously created transformations. reduce() is an action that applies the passed function again and again until one value is left. take() is an action that brings values from the RDD to the local node.
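A similar hedged sketch for actions: reduce() and take() trigger execution and return values to the driver.

```python
from pyspark import SparkContext

sc = SparkContext("local[*]", "actions-demo")
rdd = sc.parallelize([1, 2, 3, 4, 5])

# reduce() applies the passed function pairwise until a single value is left
total = rdd.reduce(lambda a, b: a + b)   # -> 15

# take(n) brings the first n elements back to the local (driver) node
first_three = rdd.take(3)                # -> [1, 2, 3]

print(total, first_three)
sc.stop()
```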

8) Define SparkCore functions

Spark Core serves as the base engine and performs various significant functions such as memory management, job monitoring, fault tolerance, job scheduling, and interaction with storage systems.

Join the DBA course to know more about the basic interview questions that you may face while attending an interview.


Memcached And Its Importance

If you want to develop a high-performance, large-scale web application, the Memcached distributed caching solution is more than enough for you. It is, without doubt, a popular distributed caching system.

It was created by Brad Fitzpatrick in 2003, and it is heavily used by many applications, such as those written in PHP.

Working of Memcached :

Memcached's distributed caching architecture is based on sharding of the keys. Each key is stored in a dedicated shard, which is backed by one or more machines.

This approach supports better scaling and caching of bulk data. The maximum a single machine can cache is limited by its RAM. With Memcached, you can add more machines to your system and cache much larger amounts of data.

The system ensures storage and retrieval of a given key without the user needing to know where the data is actually stored.
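To show how key sharding looks from the client side, here is a small sketch using the pymemcache library's HashClient, which hashes each key to pick one of the listed nodes; the host names are placeholders.

```python
from pymemcache.client.hash import HashClient

# Two cache nodes; the client hashes each key to decide which shard stores it
client = HashClient([("cache-1.local", 11211), ("cache-2.local", 11211)])

client.set("user:42:profile", b"Alice")
print(client.get("user:42:profile"))   # the caller never sees which node served it
```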

Popularity Behind Memcached :

Memcached is used by many famous web applications. Here are a few key benefits of using a distributed caching solution such as Memcached.

  • The application is much faster, since most of the data is served from RAM and there is a reduction in IO.

  • Better usage of RAM: multiple servers often have lots of RAM left unused, and you can easily add those machines as nodes to a Memcached system and use that RAM to the core.

  • The application can be scaled out instead of scaled up.

Usage of Memcached :

It is a famous library that is used by thousands of apps and is very popular. Here are a few popular names that use Memcached.

  • Craigslist

  • Wikipedia

  • WordPress

  • Flickr

  • Apple

Things To Note About Memcached :

It is a very reliable solution, but there are certain things to note about it:

  • RAM storage: Because the data is stored in RAM, Memcached is much faster, but the data is also easy to lose. The data is not persisted to any storage system. If there is a power loss or a server crash, all the data in Memcached is lost.

  • Since the data lives only in RAM, the cache starts empty after every restart. The programmer must therefore decide whether to serve data from the cache or from storage (see the sketch after this list).

  • Since nothing is persisted to storage, the application developer must take care of persisting and updating the data in the various situations that require it.

  • Memcached does not support transactions, and this needs to be a big consideration if you are caching transactional data.

  • It can be CPU intensive, as it produces a lot of garbage in memory.
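One common way to handle the empty-cache-after-restart problem is the cache-aside pattern sketched below with pymemcache; `db.load_user` is a hypothetical database call used only for illustration.

```python
from pymemcache.client.base import Client

cache = Client(("127.0.0.1", 11211))

def get_user(user_id, db):
    key = f"user:{user_id}"
    value = cache.get(key)                 # None after a restart, eviction, or expiry
    if value is None:
        value = db.load_user(user_id)      # hypothetical call to the real database
        cache.set(key, value, expire=300)  # repopulate the cache for 5 minutes
    return value
```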

For more information join the DBA Training Course to make your career in this field.

Stay connected to CRB Tech for more technical optimization and other updates and information.


Teradata

Define Teradata :

Do you know one of the most famous Relational Database Management Systems? It is called Teradata. It is mainly suitable for building large-scale data warehousing applications, and it achieves this through parallelism. It was developed by the company Teradata.

Evolution of Teradata :

Here is a list of achievements and progress of Teradata over the years:

  • 1979: Teradata was incorporated.

  • 1984: The first database computer, DBC/1012, was released.

  • 1986: Teradata was named Product of the Year by Fortune magazine.

  • 1999: A Teradata database of 130 terabytes was considered the largest database in the world.

  • 2002: Teradata V2R5 was released with Partitioned Primary Index and compression.

  • 2006: Teradata launched its Master Data Management solution.

  • 2008: Teradata 13.0 was released with Active Data Warehousing.

  • 2011: Teradata acquired Aster and entered the advanced analytics space.

  • 2012: Version 14.0 of Teradata was introduced.

  • 2014: Version 15.0 of Teradata was introduced.

Features of Teradata:

Here are few features of Teradata:

No Sharing Architecture: Teradata's architecture is also called a Shared Nothing Architecture. The Teradata nodes, the Access Module Processors (AMPs), and the disks linked with the AMPs work independently; nothing is shared with others.

Unlimited Parallelism: The Teradata database system is based on a Massively Parallel Processing (MPP) architecture. The MPP architecture divides the workload evenly across the entire system. The Teradata system splits each task among its processes and runs them in parallel to make sure the task is completed quickly.

Linear Scalability: Teradata systems are highly scalable and can scale up to 2048 nodes. For instance, the capacity of the system can be doubled by doubling the number of AMPs.

Connectivity: Teradata can connect to channel-attached systems such as mainframes as well as to network-attached systems.

Mature Optimizer: The Teradata optimizer is one of the most mature optimizers in the market. It has been designed for parallelism from the start and has been refined with each release.

SQL: Teradata supports industry-standard SQL for interacting with the data stored in tables. Apart from that, it offers its own extensions (see the sketch after this list of features).

Robust Utilities: Teradata offers robust utilities such as FastLoad, FastExport, MultiLoad, and TPT for importing data into and exporting data from Teradata systems.

Automatic Distribution: Teradata automatically distributes the data evenly across the disks, with no manual intervention needed.
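As a hedged example of the standard SQL support mentioned above, the sketch below uses the teradatasql DB API driver for Python; the host, credentials, and table name are placeholders, and TOP is one of Teradata's own SQL extensions.

```python
import teradatasql  # Teradata SQL Driver for Python (DB API 2.0 style)

# Connection details below are placeholders, not real credentials
con = teradatasql.connect(host="tdhost.example.com", user="dbc", password="***")
cur = con.cursor()

# Standard SQL plus Teradata's TOP extension on a hypothetical table
cur.execute("SELECT TOP 5 * FROM sales.orders ORDER BY order_date DESC")
for row in cur.fetchall():
    print(row)

cur.close()
con.close()
```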

Join the DBA Course to know more about Teradata.

Stay connected to CRB Tech for more technical optimization and other updates and information.


Important Things About Hadoop and Apache Spark

In the big data space they are seen as competitors, but there is a growing consensus that they are better together. If you read about big data, you will come across Apache Spark and Hadoop. Here is a brief overview and comparison of the two.

1) They do different things:

Hadoop and Apache Spark are both big data frameworks, but they do not serve the same purpose. Hadoop is essentially a distributed data infrastructure: it distributes massive data collections across multiple nodes within a cluster of commodity servers, which means you don't need to buy and maintain expensive custom hardware. Spark, on the other hand, is a data processing tool that works on those distributed data collections; it doesn't do distributed storage.

2) They can be used independently:

Hadoop includes not just a storage component, known as the Hadoop Distributed File System (HDFS), but also a processing component called MapReduce, so you don't need Spark to get your processing done. Conversely, it is possible to use Spark without Hadoop. Spark does not come with its own file management system, so it needs to be combined with one; if HDFS is not an option, another cloud-based data platform can be used. Spark was designed for Hadoop, however, so many people agree they work better together.

3) Spark is faster:

MapReduce is generally slower than Spark because of the way each one processes data. MapReduce operates in steps, while Spark works on the whole data set in one fell swoop. The MapReduce workflow looks like this: read the data from the cluster, perform an operation, write the results to the cluster, read the updated data from the cluster, perform the next operation, write the next results to the cluster, and so on. Spark, in contrast, completes the full data analytics in memory and in near real time: it reads the data from the cluster, performs all the requisite analytic operations, and writes the results back. As a result, Spark can be up to 10 times faster than MapReduce for batch processing and up to 100 times faster for in-memory analytics.
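To illustrate the in-memory difference, here is a small PySpark sketch: after cache(), repeated analytic passes over the same RDD are served from memory instead of being re-read from disk. The file path is a placeholder.

```python
from pyspark import SparkContext

sc = SparkContext("local[*]", "in-memory-demo")

logs = sc.textFile("hdfs:///data/app.log")   # placeholder path

# cache() keeps the filtered RDD in memory after its first computation
errors = logs.filter(lambda line: "ERROR" in line).cache()

print(errors.count())                                    # first pass reads from storage
print(errors.filter(lambda l: "timeout" in l).count())   # later passes reuse memory
sc.stop()
```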

4) You may not need Spark's speed:

If your data operations and reporting requirements are mostly static and you can wait for batch-mode processing, MapReduce's way of working will be just fine. But if you need to do analytics on streaming data, for example from sensors on a factory floor, or have applications requiring multiple operations, you should go with Spark. Common applications for Spark include real-time marketing campaigns, online product recommendations, analytics, machine log monitoring, and so on.

Thus join DBA Course to know more about Hadoop and Apache Spark.

Stay connected to CRB Tech for more technical optimization and other updates and information.


APACHE IGNITE

Apache Ignite is an in-memory computing platform that can be inserted between a user's application layer and data layer. It loads data from the existing disk-based storage layer into RAM, improving performance by as much as six orders of magnitude.

The in-memory data capacity can be easily scaled to handle petabytes of data. Both ACID transactions and SQL queries are also supported. Ignite offers scale, performance, and comprehensive capabilities far above and beyond what traditional in-memory databases and data grids offer.

Users do not need to rip and replace their existing databases to adopt Apache Ignite. It works with NoSQL, RDBMS, and Hadoop data stores. Enabling high performance, real-time streaming, and fast analytics are some of Apache Ignite's highlights. It uses a massively parallel architecture on shared, affordable commodity hardware to power current or new applications. Apache Ignite can run on premises, on cloud platforms like Microsoft Azure and AWS, or in a hybrid environment.

Key Features

Apache Ignite contains an in-memory data grid for handling distributed in-memory data management. It provides an object-based, ACID-transactional, failover-capable, in-memory key-value store. In contrast to traditional database management systems, Apache Ignite uses memory as the primary storage mechanism.

Using memory instead of disk can make it up to 1 million times faster than traditional databases.

Apache Ignite supports free-form, ANSI SQL-99 compliant queries with virtually no limitations. You can use any SQL function, grouping, or aggregation, and it supports distributed, non-collocated SQL joins and cross-cache joins. Ignite also supports the concept of field queries to reduce serialization and network overhead. In addition, Apache Ignite includes a compute grid to enable parallel in-memory processing of CPU-intensive or other resource-intensive tasks, such as traditional MPP, HPC, fork-join, and MapReduce processing. Asynchronous processing via the standard Java ExecutorService is also supported.
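As a small, hedged illustration of the in-memory key-value store, the sketch below uses the pyignite thin client; the address and cache name are placeholders and assume a local Ignite node listening on the default thin-client port.

```python
from pyignite import Client  # Apache Ignite thin client for Python

client = Client()
client.connect("127.0.0.1", 10800)   # default thin-client port, placeholder host

# Create (or open) a cache and use it as a distributed key-value store
cache = client.get_or_create_cache("session-cache")
cache.put("user:42", "alice")
print(cache.get("user:42"))          # -> 'alice'

client.close()
```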

Join the DBA course to make your career in this field.

Stay connected to CRB Tech for more technical optimization and other updates and information.


Core Security Areas in MongoDB

There are new developments in MongoDB security. There are lots of news stories revealing how hackers have seized MongoDB databases and ransomed the data for bitcoins.

Security is always a worry, and if you run databases, networks, or applications, it is always a prime issue. As lots of companies turn to open source software such as MongoDB for storing significant enterprise data, security becomes an important question. Depending on your business, you may also have government or industry network security regulatory standards to observe.

MongoDB is safe to use here, provided you know what to look for and how to configure it.

The main thing to ask is: how do people go wrong with MongoDB security?

There are several areas where MongoDB users commonly go wrong with security, such as (see the connection sketch after this list):

  • Using the default ports.

  • Not enabling authentication immediately.

  • Granting broader access than needed when using authentication.

  • Not using LDAP to force password rotations.

  • Not forcing SSL usage on the databases.

  • Not limiting database access to known network devices.
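The sketch below shows what avoiding several of these mistakes can look like from an application using pymongo: a non-default port, authentication against the admin database, and TLS. The host, credentials, and CA file path are placeholders.

```python
from pymongo import MongoClient

# Non-default port, authenticated user, and TLS enabled (all values are placeholders)
client = MongoClient(
    "mongodb://app_user:s3cret@db.example.com:27117/?authSource=admin",
    tls=True,
    tlsCAFile="/etc/ssl/mongo-ca.pem",
)

print(client.admin.command("ping"))   # verify the secured connection works
```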

Five core security areas in MongoDB

Authentication: LDAP authentication centralizes items in your company directory.

Authorization: Authorization defines the role-based access controls that the database provides.

Encryption: Encryption can be broken into At-Rest and In-Transit. Encryption is used for securing important data.

Auditing: Auditing is the ability to see who did what in the database.

Governance: Governance refers to document validation and checking for sensitive data (such as an account number, password, Social Security number, or birth date).

LDAP Authentication

MongoDB has built-in user roles, and they are turned off by default. However, it misses items like password complexity, age-based rotations, and the identification and centralization of user roles versus service functions.

Luckily, LDAP can be used to fill many of these gaps. There are lots of connectors that can use Windows Active Directory.

Note: LDAP support is available in MongoDB Enterprise, but not in the Community version. It is, however, available in other open source versions of MongoDB such as Percona Server for MongoDB.

Custom roles

Role-based access control (RBAC) is at the core of MongoDB. Some built-in roles have been available since MongoDB version 2.6. You can also set custom roles with new limitations on what can or cannot be accessed by the users.
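As a hedged sketch of RBAC, the example below uses pymongo to create a custom role limited to read-only access on a single collection and grants it to a user; the database, collection, and user names are illustrative.

```python
from pymongo import MongoClient

# Administrative connection (credentials are placeholders)
client = MongoClient("mongodb://admin:***@localhost:27017/?authSource=admin")
db = client["reporting"]

# Custom role: may only run find() on reporting.orders
db.command(
    "createRole",
    "readOrdersOnly",
    privileges=[{
        "resource": {"db": "reporting", "collection": "orders"},
        "actions": ["find"],
    }],
    roles=[],
)

# Grant the custom role to a new user
db.command("createUser", "report_user", pwd="***", roles=["readOrdersOnly"])
```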

For more information, join the DBA course in Pune to make your career in this field.

Stay connected to CRB Tech for more technical optimization and other updates and information.
