Photo by Andrew Petrischev on Unsplash

What is it? Open-source alternatives? How to implement it? Challenges and more!

This publication aims to shed light on a situation in which “some” institutions have begun to understand and value their data as an internal product that boosts, qualifies, and distinguishes their products in the market in the short term. Unfortunately, inside these same companies, each business area built its own architecture, often without a standard, without quality or basic sanitation, and distributes data in an archaic and slow way, full of “gambiarras” (makeshift workarounds): a veritable data slum.

In this chaotic situation, it is very common for business people to be somewhat alienated, blaming situations of slowness in a…

Photo by Andrew Petrischev on Unsplash

What is it? Open-source alternatives? How to implement it? Challenges and more!

This publication aims to shed light on a situation in which “some” institutions have begun to understand and value their data as an internal product that boosts, qualifies, and distinguishes their products in the market in the short term. Unfortunately, inside these same companies, each business area built its own architecture, often without a standard, without quality or basic sanitation, and distributes data in an archaic and slow way, full of makeshift workarounds: a veritable data slum.

In this chaotic situation, it is very common for business people to be somewhat alienated, placing…

Alternatives to Jupyter Notebook for Python and more!

Jupyter emerged and gained respect as an easy-to-install solution that makes it simpler to write code and visualize its results, a form of interactive computing focused on usability. With the success of Python, other tools on the market started to support the language and compete directly with Jupyter.

#Now that I have spoken well of Jupyter, let’s go to the competition… :)

Photo by Nicole Wolf on Unsplash

Boost your JupyterLab with these tips!

This publication is a list of extensions that can make the JupyterLab IDE easier to use; here are the tips:

# Variable Inspector

This extension shows the variables used and their values.
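
To make the idea concrete, here is a minimal Python sketch of what "showing the variables used and their values" can look like. It is not the extension's code: the helper `inspect_variables` and the sample `session` dictionary are hypothetical names introduced only for illustration.

```python
import types
from typing import Any, Dict, List, Tuple


def inspect_variables(namespace: Dict[str, Any]) -> List[Tuple[str, str, str]]:
    """Return (name, type, value) rows for the plain data variables in a namespace."""
    rows = []
    for name in sorted(namespace):
        value = namespace[name]
        # Skip private names, functions/classes, and imported modules.
        if name.startswith("_") or callable(value) or isinstance(value, types.ModuleType):
            continue
        rows.append((name, type(value).__name__, repr(value)))
    return rows


# Hypothetical session state standing in for a notebook's namespace.
session = {"x": 42, "name": "jupyter", "values": [1, 2, 3], "_hidden": 0, "helper": len}

for row in inspect_variables(session):
    print(row)
```

In a live notebook, the same function could be called with `globals()` instead of the sample dictionary.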

Photo by Perchek Industrie on Unsplash

Improving the performance of Apache Cassandra, best practices, and a little more! :)

Cassandra is an open-source NoSQL database developed to ensure rapid scalability and high availability of data, maintained mainly by the Apache Software Foundation and its community.
Its main features are:

  • “Decentralization”: all nodes have the same functionality.
  • “Resilience”: data is replicated across several nodes; replication across multiple data centers is also supported.
  • “Scalability”: adding new nodes to the cluster is fast and does not affect the system's performance; there are systems running Cassandra with thousands of nodes today.
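
The multi-data-center replication mentioned above is configured per keyspace in CQL. The sketch below only builds the statement as a string; the function name `build_keyspace_cql`, the keyspace `demo`, and the data center names `dc1` and `dc2` are assumptions made for illustration, and the replica counts are arbitrary.

```python
from typing import Dict


def build_keyspace_cql(keyspace: str, replicas_per_dc: Dict[str, int]) -> str:
    """Build a CREATE KEYSPACE statement that replicates data across data centers."""
    if not replicas_per_dc:
        raise ValueError("at least one data center is required")
    # One "'<data center>': <replica count>" entry per data center.
    options = ", ".join(f"'{dc}': {count}" for dc, count in replicas_per_dc.items())
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        f"WITH replication = {{'class': 'NetworkTopologyStrategy', {options}}};"
    )


# Hypothetical cluster: three replicas in dc1 and two in dc2.
print(build_keyspace_cql("demo", {"dc1": 3, "dc2": 2}))
```

The resulting statement would still have to be run through `cqlsh` or a driver against a real cluster, where the data center names must match the ones the cluster reports.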

According to Apache, we can look at Cassandra as being:

“The Apache Cassandra database is the right choice when you need…

Photo by Perchek Industrie on Unsplash

Improving the performance of Apache Cassandra, best practices, and a little more! :)

Cassandra is an open-source NoSQL database developed to ensure rapid scalability and high availability of data, maintained mainly by the Apache Software Foundation and its community.

Its main features are:

  • “Decentralization”: all nodes have the same functionality.
  • “Resilience”: data is replicated across several nodes; replication across multiple data centers is also supported.
  • “Scalability”: adding new nodes to the cluster is fast and does not affect the system's performance; there are systems running Cassandra with thousands of nodes today.

According to Apache, we can look at Cassandra as:

“The Apache Cassandra database is…

Photo by Fotis Fotopoulos on Unsplash

RStudio in the Docker container environment!!

Docker and containers

Increasingly, we need to create segregated and resilient environments capable of supporting everything from small applications to distributed databases, governing infrastructure and network resources intelligently while scaling solutions horizontally as each scenario requires.

As a solution to this dilemma, containers appeared. According to Microsoft, they are “similar to virtual machines, but they do not create an entire virtual operating system. Instead, Docker allows the application to use the same Linux kernel as the system on which it is running.”

Docker is one of the main…

Photo by Possessed Photography on Unsplash

How to comply with LGPD using MLOps, a data catalog, and more?

We are adapting to the General Data Protection Law (LGPD). The main objective of this new law is to guarantee data privacy and reliability, but how is the data area adapting to this new reality? What strategies are being adopted? What about Artificial Intelligence (AI)?

These are some of the strategies being adopted in the data area:

  • Infinite Forms?
    This strategy aims to create one or more forms to manage who accesses the data and where the data and its sources are. …

Photo by Boba Jaglicic on Unsplash

Discover Apache Hive, its power, and more!! :)

#Big data

More and more, we have to deal with large Volumes of data that are created and need to be used at unbelievable Velocity, with a huge Variety that is almost impossible for a human being to follow, while staying concerned with its Veracity and still adding Value to the business effectively (the 5 Vs of Big data).

To deal with this, the term “Big data” emerged, along with several solutions to these problems in different scenarios, such as Apache Hive.


According to IBM, “Apache Hive is open source data warehouse…

Photo by Kelly Sikkema on Unsplash

Deploying and running Spark jobs written in Scala on GCP is easy!!

This is a simple tutorial with examples of using Google Cloud to easily run Spark jobs written in Scala! :)

1. Install the Java 8 JDK or Java 11 JDK

To check if Java is installed on your operating system, use the command below:

java -version

#Depending on the Java version, this command may differ … :)

If Java is not installed yet, install it from one of these links: Oracle Java 8, Oracle Java 11, or AdoptOpenJDK 8/11, always checking the compatibility between the JDK and Scala versions by following the guidelines at this link: JDK Compatibility.
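
The check in this step can also be scripted. The following Python sketch assumes the usual `version "…"` line that `java -version` prints, and treats the legacy `1.8` numbering as Java 8; the function names `parse_java_major` and `installed_java_major` are hypothetical helpers, not part of any tool mentioned here.

```python
import re
import subprocess


def parse_java_major(version_output: str) -> int:
    """Extract the major Java version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no Java version string found")
    major = int(match.group(1))
    # Java 8 and older report themselves as 1.x (for example 1.8.0_292).
    if major == 1 and match.group(2):
        return int(match.group(2))
    return major


def installed_java_major() -> int:
    """Run `java -version` and return the major version of the installed JDK."""
    # `java -version` writes its report to stderr rather than stdout.
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=True
    )
    return parse_java_major(result.stderr or result.stdout)


# Sample outputs in the two common formats.
print(parse_java_major('java version "1.8.0_292"'))
print(parse_java_major('openjdk version "11.0.11" 2021-04-20'))
```

Calling `installed_java_major()` raises `FileNotFoundError` when no `java` executable is on the PATH, which is itself a sign that the JDK still needs to be installed.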

2. Install Scala


Josue Luzardo Gebrim

Here I occasionally share my opinion and what little I know.
