Imports

import pandas as pd
# qqplot's import path was lost in the source; statsmodels is its usual home
from statsmodels.graphics.gofplots import qqplot

df = pd.read_csv('pollfish-data.csv', encoding='latin1')
df = df.drop(['Time Started', 'Time Finished', 'Manufacturer', 'OS', 'Year Of Birth',
              'US Census Region', 'US Census Division', 'US Congressional District',
              'DMA Code', 'DMA Name', 'Weight', 'Country', 'Provider',
              'For each of the following statements, please indicate how much you agree or disagree based on your experience at workplaces in general (not in any specific job).'],
             axis=1)

Next we'll rename the columns to be easier to work with.

Scale item columns

First we'll rename the items belonging to the two psychology scales used: one measuring psychological safety at work and the other measuring general self-efficacy at work. We'll rename the items based on the scale to which they belong, in the order listed in the source articles. The 7-item Psychological Safety scale, here denoted by the PS prefix, was developed by Edmondson (1999), "Psychological safety and learning behavior in work teams", Administrative Science Quarterly, 44(2), 350-383.

df = df.rename(columns={
    'If I make a mistake at work, it is held against me.': 'PS1',
    'People at work are able to bring up problems and tough issues.': 'PS2',
    'People at work sometimes reject others for being different.': 'PS3',
    'It is safe to take a risk at work.': 'PS4',
    'It is difficult to ask other people at work for help.': 'PS5',
    # the remaining two items of the 7-item scale continue in the full post
})
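As a quick sanity check, the sketch below (not from the original post) peeks at the renamed items and computes a simple per-respondent scale mean, assuming the full seven-item rename has been applied:

# Minimal sketch, assuming all PS columns exist after the rename above.
# Note: negatively worded items would need reverse-coding before averaging;
# that step is omitted here.
ps_items = [col for col in df.columns if col.startswith('PS')]
print(df[ps_items].head())

df['PS_mean'] = df[ps_items].mean(axis=1)   # unweighted scale mean
print(df['PS_mean'].describe())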
This post assumes that the reader (yes, you!) has access to and is familiar with Python, including installing packages, defining functions and other basic tasks. If you are new to Python, this is a good place to get started. I have tested the scripts in Python 3.7.1 in Jupyter Notebook.

Let's make sure you have the following libraries installed before we start:
◼️ Data manipulation/analysis: numpy, pandas
◼️ Data partitioning: sklearn
◼️ Text preprocessing/analysis: nltk, textblob
◼️ Visualisation: matplotlib, seaborn

Once you have nltk installed, please make sure you have downloaded 'stopwords', 'wordnet' and 'vader_lexicon' from nltk with the script below. If you have already downloaded them, running this will simply notify you so.

import nltk
nltk.download('stopwords')
nltk.download('wordnet')
nltk.download('vader_lexicon')

Now, we are ready to import the packages:

# Set random seed
seed = 123

# Data manipulation/analysis
import numpy as np
import pandas as pd

# Text preprocessing/analysis
import re
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import RegexpTokenizer
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from textblob import TextBlob
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import MinMaxScaler

# Modelling
from sklearn.model_selection import train_test_split, cross_validate, GridSearchCV, RandomizedSearchCV
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.pipeline import Pipeline

# Visualisation
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
sns.set(style="whitegrid", context='talk')

You can download the dataset here and save it in your working directory. Once saved, let's import it to Python:

sample = pd.read_csv('IMDB Dataset.csv')
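Before preprocessing, it's worth a quick look at what we loaded. The sketch below is not from the original post; the column names review and sentiment are an assumption based on the common Kaggle layout of 'IMDB Dataset.csv':

# Quick sanity check of the loaded data.
# 'review' and 'sentiment' column names are assumed, not taken from the post.
print(sample.shape)
print(sample.head())
print(sample['sentiment'].value_counts())   # check the class balance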
Like before, the output will be saved to a dataframe called g_search_results. Grid searching will also take a bit of time because we have 24 different combinations of hyperparameters to try.

g_search = GridSearchCV(estimator=pipe, param_grid=param_grid, cv=5, n_jobs=-1)
g_search.fit(X_train, y_train)

# Save results to a dataframe
g_search_results = pd.DataFrame(g_search.cv_results_).sort_values(by='rank_test_score')

Let's extract the more relevant columns to another dataframe:

columns = [...]  # the list of relevant result columns was lost in the source
g_summary = g_search_results[columns].copy()
# Shorten the column names; the exact rule was garbled in the source, and
# keeping the text after the last underscore is one plausible reading
columns = [col.split('_')[-1] if '_' in col else col for col in g_summary.columns]
g_summary.columns = columns
g_summary
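The grid search above relies on a pipeline pipe and a grid param_grid defined earlier in the post but not shown in this excerpt. The sketch below is one plausible shape for them; the steps and values are assumptions, chosen so that the grid has 2 x 3 x 4 = 24 combinations to match the count mentioned above:

# Hypothetical pipeline and grid; the post's actual definitions are not
# shown in this excerpt, so treat the names and values as illustrative only.
pipe = Pipeline([
    ('tfidf', TfidfVectorizer(stop_words='english')),
    ('clf', LogisticRegression(solver='liblinear', random_state=seed)),
])

param_grid = {
    'tfidf__ngram_range': [(1, 1), (1, 2)],   # 2 options
    'tfidf__min_df': [1, 5, 10],              # 3 options
    'clf__C': [0.01, 0.1, 1, 10],             # 4 options -> 24 combinations
}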