TangibleAI

How to Use Quizzes in your Chatbots to Educate Your Audience

2022-09-05T00:00:00-04:00

Social-sector organizations are turning to chatbots to meet their goals for learning initiatives. It’s challenging, though, to attract users to educational content. Think about how you might feel if you click on a link and get a 50-page document in a new browser tab. Like most users, you might close both tabs and move on. The user hasn’t received the message. Quizzes, though, are a great addition to a chatbot that delivers educational content.

Quizzes are already proven to get users’ attention. Back in 2016, Content Marketing Institute noted that 81% of marketers thought people paid more attention to interactive content, like quizzes, than static content. How much more? The 2019 State of Interactive Content Marketing report put that engagement rate at 2x higher. Those numbers aren’t surprising. Quizzes offer a brief, fun experience and even pique users’ curiosity. We usually think of quizzes as a way to show what someone learned, but they can also educate users about topics your organization values. We’ll look at 3 reasons that quizzes in chatbots can be a valuable tool to educate your audience.

1. Respond to User Preferences

Quizzes help users explore the key information a chatbot provides in the way they prefer. We often assume that users start with content and then move on to quizzes. When given a choice, though, users just as often prefer to engage with a quiz first instead of longer content. They want to see what they do and do not know. Then, they will review the content sections they want to know more about. Users may even end up taking a quiz several times because they want to get a better score or see what the bot says.

We developed a chatbot named Maya for Plan International Nepal to educate Nepali youth about the risks of human trafficking. Maya has 3 true/false quizzes related to 3 topics – trafficking, online safety, and safe migration for work. We’ve seen that about 26% of users choose an information module before a quiz, and 27% choose a quiz module first. Empowering users to choose how they want to receive information boosts learning.

By taking into account user preferences, your organization can attract a more diverse audience. That means more visibility for your organization and its ideas.

2. Increase Engagement

Users stick with quizzes longer than less interactive types of content. Chatbots can serve various types of content like video and images. This content, though, often doesn’t take advantage of the interactivity chatbots can bring. Users don’t have to stay in a chatbot longer than they want to. Even if they access a document or resource, users may leave before even reading it.

Quizzes give users a reason to stay – to see their score. They know the experience will be pretty short. The brevity, interactivity, and purposefulness of quizzes encourage users to stay longer. In Maya, most users finish an entire quiz once they start. Quizzes in Maya generally see a 25% drop-off, meaning that 75% of users finish the quiz. That engagement is much harder to achieve in sections that only deliver information.

Users engage with quizzes, and that interest can benefit your organization. They may want to know more about a topic or about your organization. A well-placed CTA after the quiz could take them to a newsletter subscription or a donation page. In short, users may explore more of what your organization has to offer.

3. Learn What People Are Learning

The quiz responses you get can show what your audience is learning through the chatbot. A user accessing content does not mean they learned the information well. Quizzes can help you see which topics, or even specific parts of topics, people are less familiar with. True/False quizzes provide an easy means of getting averages. Short answer responses can show users’ thoughts, such as what lesson they valued most. Your organization can use quiz responses to improve the chatbot and its content. But keep in mind that some users explore quizzes by intentionally giving wrong answers. The data quizzes provide can be useful, but it needs to be analyzed with care.

Quizzes aren’t just for schools. They can be a valuable tool in your chatbot. A well-designed quiz can give users key information in an engaging way. It can also give your organization insights into what your audience is learning. Quizzes work well with other content that your chatbot might serve. We’ve put together some tips for making effective quizzes to educate your audience.

10 Tips for Making Quality Chatbot Quizzes

Limit quizzes to 5-10 questions
Make the answer choices clear
Give feedback after each question instead of at the end
Question + Answer + Feedback should fit in the chatbot window
Set expectations with directions and explanations before the quiz
Pick questions that target the most important information users should know
Tie quiz items to your organization’s objectives
Add prompts from the quiz to other content (or vice versa)
Give users the option to share their results
Consider having CTAs like email subscriptions or signups after the quiz
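
To make a few of these tips concrete (a short quiz, clear true/false choices, and feedback after each question), here is a minimal Python sketch. The `QuizItem` class, the `run_quiz` function, and the placeholder statements are hypothetical and are not taken from Maya. A real chatbot platform would send and receive the messages itself, so treat this as an illustration of the flow only.

```python
from dataclasses import dataclass


@dataclass
class QuizItem:
    statement: str  # true/false statement shown to the user
    answer: bool    # the correct answer
    feedback: str   # short explanation shown right after the user answers


def parse_true_false(text):
    """Map a typed reply to True/False, or None if the reply is unclear."""
    cleaned = text.strip().lower()
    if cleaned in {"true", "t", "yes", "y"}:
        return True
    if cleaned in {"false", "f", "no", "n"}:
        return False
    return None


def run_quiz(items, replies):
    """Return (score, messages) for quiz items and the matching user replies."""
    score = 0
    messages = []
    for item, reply in zip(items, replies):
        guess = parse_true_false(reply)
        if guess is None:
            # A live bot would re-ask the question; here we only record the prompt.
            messages.append("Please answer True or False.")
            continue
        if guess == item.answer:
            score += 1
            messages.append(f"Correct! {item.feedback}")
        else:
            messages.append(f"Not quite. {item.feedback}")
    # The score comes last, but feedback was already given per question.
    messages.append(f"You scored {score} out of {len(items)}.")
    return score, messages


# Placeholder content: replace with statements tied to your own objectives.
items = [
    QuizItem("Placeholder statement A.", True, "Placeholder explanation A."),
    QuizItem("Placeholder statement B.", False, "Placeholder explanation B."),
]
score, messages = run_quiz(items, ["True", "yes"])
for message in messages:
    print(message)
```

Keeping the feedback text short helps the question, answer, and feedback fit in the chatbot window together.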

We can help you leverage chatbots to educate your audience about important issues. We make chatbots that not only use quizzes but also other engaging content to spread your message.

Ready to create your own chatbots?

Let's talk

Improve your chatbot by analyzing user messages in 5 steps (Part 2)

2022-07-21T00:00:00-04:00

This is Part 2 in the post series about improving your chatbot by analyzing user messages. (You can read Part 1 here)

Social impact organizations need to pay attention to what users say when communicating with their chatbot. Users often communicate with natural language responses, also known as “free-text” or “open” responses. Organizations may actually elicit open responses to get feedback or for filling out forms. Additionally, users may try to chat with a bot using natural language responses, whether the bot prompts such messages or not. All of these responses provide nonprofits and other organizations an opportunity to gain insight into the most critical aspects of their chatbot.

In part 1 of this post series, we described a process to systematically collect and organize free-text user responses. Once that’s done, though, you need to know what to look for to get the most value from them. We’ll look at 5 key areas to focus on when analyzing natural language responses for insights to improve your chatbot.

Questions / Queries

Even with menu-driven chatbots, people will still have questions that aren’t addressed or aren’t easy to find. Questions provide insights into what users care about and want to know. In educational bots we have worked on for nonprofit organizations, we noticed users looked for information about very relevant topics that were not initially considered for the project. We’ve used these insights to add new features and develop an FAQ section for the bots.

In Maya, a chatbot we developed that educates Nepali youth about the risks of human trafficking, the initial scope of the project focused on basic information about what trafficking was and how to stay safe. Users asked many additional questions, such as statistics on trafficking and how domestic violence is related to trafficking. Adding these insights not only helped engagement with the bot but also improved the depth of learning Maya could assist with.

Navigation

Navigation is critical to a chatbot’s success. Analyzing messages shows where users might be trying to switch conversation paths or return to certain menus. Users give up pretty quickly when they can’t find what they are looking for. Navigation issues need to be fixed, but it can be hard to understand why users aren’t reaching certain content.

Even simple menu-based bots are not immune to navigation errors. In one bot we worked on, we noticed that users typing “C” (with quotation marks) to select option C caused a fallback response, because the system was trained to recognize option C without quotes. It would have been difficult to spot this problem without analyzing natural language responses.
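
One way to guard against this kind of error is to clean up the user's text before matching it against the menu. The Python sketch below is only an illustration of that idea. The `normalize_choice` function and the option letters are hypothetical, and many chatbot platforms handle matching in their own configuration rather than in code like this.

```python
def normalize_choice(text, valid_options=("A", "B", "C")):
    """Return the matching menu option, or None to signal a fallback."""
    # Remove surrounding whitespace, straight and curly quotes,
    # periods, and parentheses, then compare case-insensitively.
    cleaned = text.strip().strip("\"'“”‘’.()").strip().upper()
    return cleaned if cleaned in valid_options else None


for raw in ['"C"', "“c”", " C. ", "hello"]:
    print(repr(raw), "->", normalize_choice(raw))
```

Messages that still return `None` after normalization are the ones worth collecting and reviewing.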

Small Talk

A chatbot needs to be able to respond to small talk messages. These messages may not target relevant information, but providing a good response improves the user’s conversational experience and increases the chance of the user sticking around. Users bring their experience talking to people to their interactions with chatbots. So they expect human-like reactions, such as greetings, introductions, and closings. When working on bots at Tangible AI, we have a list of common intents that chatbots should be able to respond to, such as “who are you”, “continue”, and “thank you”.

Aside from basic small talk, we have also noticed that users across our bots are concerned about their privacy. They ask “Is my conversation private” or “Can I talk with you privately”. Having a poor response to such an important question will certainly make a user uncomfortable. You need to help users understand the context of their conversation with a chatbot.

Feedback

Users might also provide feedback on their experience with the chatbot. The feedback might come from a survey within the chatbot itself. It’s also possible a user would just give some feedback response to your chatbot unexpectedly.

These messages could be positive or negative. With Maya, users left encouraging messages that certain parts of the bot were great, particularly that they learned a lot. They also left messages critiquing the activities the bot provided, or expressing their confusion. Both complimentary and negative feedback provided us with a better understanding of how the experience was working. We learned how to improve our activities based on the feedback. Also, we learned why other components of the bot worked well, which let us improve our development going forward.

Out of Scope/Spam

Some messages users send will not be relevant to the purpose of your chatbot. Sometimes, users may not understand what the chatbot’s for. They may ask for information or functions that aren’t available, like providing local daily news. Of course, users may just want to test out the chatbot or play around with it. It’s unfortunately common that chatbots face harassment. In most cases, there is not much one can do to reduce spam messages, especially for a bot that targets a broader audience.

In order to get the most out of their chatbots, nonprofits need to work with natural language responses that their chatbot doesn’t recognize. While it can be tempting to focus on the successful exchanges a chatbot has, messages that a chatbot doesn’t know how to handle can offer insights into what’s not working. More importantly, free responses are direct communication from users, and the messages provide a rich source of information that can help you plan how to make your bot more effective for your audience. Implementing a process to deal with natural language is important for the same reason that the bot exists. With a clear process, these messages can provide insights into some of the most critical aspects of a chatbot that tell how well it is accomplishing its purpose.

Improve your chatbot by analyzing user messages in 5 steps (Part 1)

2022-06-02T00:00:00-04:00

Interest in and use of chatbots continue to grow among nonprofit organizations as they allow for providing services to and educating constituents at scale in an interactive, engaging way. But once you launch your chatbot and let it interact with users, you will notice that one of its most frequent answers to users is “I’m sorry, I don’t understand”.

This issue doesn't just affect complex bots, or even ones that have Natural Language Understanding (NLU) capabilities. Most social impact organizations use menu-based bots where users select buttons or enter numbers to navigate the chatbot content. Despite the limited spectrum of allowed interactions, users will often try to address the bot with free-text responses that use language the bot was not programmed to handle. These responses, which can comprise as much as 30% of the total interactions with the bot, will trigger a catch-all response (also called a “fallback” or “default” response). Repeatedly getting this response is a frustrating experience for the users, leading them to disengage before they are able to have any meaningful interaction with the chatbot.

Whether you’ve deployed a chatbot or are planning to adopt one, it’s important to develop a strategy for dealing with natural language responses that trigger the catchall in your chatbot. If you collect and analyze them systematically, you can discover quite a lot about your users and their interactions with your bot.

For this post, we’ll focus on a reliable process you can use to work with natural language responses. In a follow-up post, we’ll consider what insights you can get from the data.

How to handle the free-text chatbot responses: A step-by-step guide

You need a clear process to keep your users' messages organized. Our clients saw significant jumps in free responses both after launch and after advertising campaigns. A clear process helps you deal with this volume while keeping you focused on the insights you can gain. At Tangible AI, we use a 5-step process to work through these messages.

Step 1 - Extract

Gather the messages that triggered a catchall response. Natural language responses, basically the responses your users type to your chatbot, can remain hidden if your platform or your organization doesn’t have a way of collecting them. Systems have different capabilities and methods for handling these responses, so it’s important to read your platform’s documentation. Beyond just getting the data, it’s useful to find ways to automate this extraction process to save time and make it easier to repeat.
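
As a rough sketch of what automated extraction could look like, the Python below filters a conversation export for messages that received the fallback response. The CSV layout, the column names `user_message` and `bot_response`, and the fallback wording are all assumptions made for this example. Your platform's export format will differ, so check its documentation.

```python
import csv
import io

# Assumed fallback wording; use whatever your bot actually sends.
FALLBACK_TEXT = "I'm sorry, I don't understand"


def extract_fallback_messages(csv_file, fallback_text=FALLBACK_TEXT):
    """Return the user messages whose bot reply was the fallback response."""
    reader = csv.DictReader(csv_file)
    return [
        row["user_message"].strip()
        for row in reader
        if fallback_text.lower() in row["bot_response"].lower()
    ]


# A tiny in-memory stand-in for an exported conversation log.
sample_export = io.StringIO('''user_message,bot_response
1,Here is the first module.
who are you,"I'm sorry, I don't understand"
"""C""","I'm sorry, I don't understand"
''')

fallback_messages = extract_fallback_messages(sample_export)
print(fallback_messages)
```

Running a script like this on a schedule is one way to make the extraction repeatable.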

Step 2 - Cluster

Try to identify groups or clusters of similar queries. Look for common patterns in the messages, similar requests, and synonyms to understand how your users phrase their requests.
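
For a small batch of messages, a first pass at clustering can be done with Python's standard library alone. The sketch below groups messages whose spelling is similar, using `difflib.SequenceMatcher`. This is a simple stand-in chosen for illustration and not necessarily the method we use on projects. The `cluster_messages` function, the 0.8 threshold, and the sample messages are assumptions. String similarity will not catch synonyms, so a manual review of the clusters is still needed.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-level similarity between two messages, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster_messages(messages, threshold=0.8):
    """Greedily group each message with the first cluster it resembles."""
    clusters = []
    for message in messages:
        for cluster in clusters:
            # Compare against the first message of the cluster only.
            if similarity(message, cluster[0]) >= threshold:
                cluster.append(message)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([message])
    return clusters


sample_messages = [
    "is my conversation private",
    "is my conversation private?",
    "thank you",
    "thank you!",
    "who are you",
]
clusters = cluster_messages(sample_messages)
for cluster in clusters:
    print(cluster)
```

Lowering the threshold merges more messages per cluster, at the cost of mixing unrelated requests.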

Step 3 - Develop Intent Model

Understand what users are trying to do. “Intents” refers to the goal the customer has in mind when typing in a question or a request. The user indicates through intents what they want the chatbot to do, or what information they would like to receive. Your users, whether they are the people your organization serves or your donors and supporters, may make similar requests with different vocabulary, so the synonyms or various sentences you clustered before will help identify these intents. Some messages may relate to existing intents. Or you might get ideas for new intents. During this stage, you need to consider what intents are the most valuable for your users and for your organization’s goals.

Sample intent model.

Step 4 - Label

With the messages clustered and the intents identified, you need to classify the messages. This step involves assigning an intent label to each message. You should have a clear naming scheme. Consistent labels will help you keep your bot organized.
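
Here is a minimal sketch of what a consistent naming scheme might look like in Python. The `category.intent` label format, the intent names, and the keyword rules are all hypothetical and only show the idea. In practice, much of the labeling is done by a person reading the messages, and keyword rules like these can at most pre-sort them.

```python
# Hypothetical intents, named as "category.intent" so labels sort and group well.
INTENT_KEYWORDS = {
    "smalltalk.thanks": ("thank",),
    "smalltalk.identity": ("who are you",),
    "privacy.confidentiality": ("private",),
    "navigation.menu": ("menu", "back"),
}


def label_message(message, intent_keywords=INTENT_KEYWORDS, default="unlabeled"):
    """Return the first intent whose keyword appears in the message."""
    text = message.lower()
    for intent, keywords in intent_keywords.items():
        if any(keyword in text for keyword in keywords):
            return intent
    # Leave unmatched messages for manual review rather than guessing.
    return default


sample_messages = [
    "Thank you",
    "Is my conversation private",
    "go back to menu",
    "asdf",
]
labeled = {message: label_message(message) for message in sample_messages}
for message, intent in labeled.items():
    print(f"{intent:28} {message}")
```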

Step 5 - Analyze

With the messages organized and labeled, you can analyze the data to find insights into how users interact with the bot. We will talk extensively about the analysis in our next post. Here are some sample analyses you can carry out:

What are the most common questions the users ask?
How are the questions distributed between topics?
What percentage of users’ messages are spam? How many of the messages relate to navigation?
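
Once every message has a label, questions like these come down to counting. The Python sketch below assumes labels written as `category.intent` (a hypothetical scheme used only for this example) and computes the share of messages per intent and per category. The sample labels are made up for illustration and are not real project data.

```python
from collections import Counter


def shares(labels):
    """Return each label's share of all messages, as a percentage."""
    total = len(labels)
    if total == 0:
        return {}
    counts = Counter(labels)
    # most_common() lists the most frequent labels first.
    return {label: round(100 * n / total, 1) for label, n in counts.most_common()}


def category_shares(labels):
    """Aggregate "category.intent" labels up to the category level."""
    return shares([label.split(".")[0] for label in labels])


# Made-up labels standing in for the output of the labeling step.
sample_labels = (
    ["question.statistics"] * 4
    + ["question.safety"]
    + ["navigation.menu"] * 2
    + ["smalltalk.thanks"] * 2
    + ["spam.other"]
)

by_intent = shares(sample_labels)
by_category = category_shares(sample_labels)
print(by_intent)
print(by_category)
```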

This 5-step process will help you work systematically with natural language responses. This kind of response may have triggered the catchall, or you may have designed your bot with open questions to elicit user feedback. Regardless, this process organizes these responses so that you can analyze them to get insights into what parts of the chatbot are working and what you could improve. In the next post, we will look at 5 specific insights you can focus your analysis on to find areas for improving your chatbot.

Choosing a Channel for Your Nonprofit Chatbot

2021-11-26T00:00:00-05:00

Nonprofit chatbots are moving into the mainstream as organizations are looking for ways to stay connected to constituents and donors, recruiting more people for the cause, and looking for better ways to live out their unique missions in the world. As your nonprofit considers launching your own chatbot, one of the first decisions is what channel you should use. After being asked this question several times by our clients, we decided to share what we have learned in this post.

Key Questions When Deciding on a Channel for Your Nonprofit Chatbot

Finding the right channel can mean the difference between the success or failure of your chatbot. As a first step, our clients find that it is helpful to answer the following questions. The answers can help guide you to an optimal choice of channel.

What are the most popular platforms in the countries where your target audience resides?
Fortunately, a quick web search will show you lists and maps of popular messaging apps listed by country. For example, WhatsApp is prevalent in most African and East Asian countries, while being much less popular in the US. You are likely to find a higher rate of adoption using a platform that already has a high use rate in the countries you are targeting.

Do you plan on engaging in repeated interactions with the chatbot?
As you look at features, it will be essential to consider whether you are expecting one-time communications via the nonprofit chatbot or whether it will be more for ongoing relationships.

Will your users be comfortable with the chatbot knowing personal details like name or email?
This is a critical question for some nonprofits. Consider an example: an international nonprofit launched a Facebook chatbot in an Asian country with a semi-authoritarian regime to give people an opportunity to report corruption. However, they were not able to attract many users, as people weren't eager to report corruption from their Facebook accounts, for fear of being identified.

Do you plan to engage users proactively?
Once people are on board with your nonprofit chatbot, will you want the ability to send updates or reminders over the same platform? As you’ll see, many messaging platforms put limits on proactive outreach to prevent spam.

How will users find the bot?

As you look at some of the channel options, you will see that for many of them, you will need a plan to grow your chatbot’s audience. This will either happen organically, because your users are accustomed to interacting with you on that channel, or you’ll need to create a campaign to attract your users.

5 Channel Options – Pros and Cons

Let’s examine 5 different channels you can use for your chatbot. If you are looking for examples of chatbots for each channel – feel free to head to our social impact chatbot database and look for chatbots in your area of interest!

Chatbot on Your Website / Landing Page

The most obvious place for your nonprofit chatbot might be right on your organization's web page or landing page. In many cases, this will offer the most significant amount of exposure and more options.

Benefits:

Accessibility – The chatbot is accessible to everyone with web access and doesn't require a login or personal information to use.
Exposure – The feature can benefit from your existing website traffic. If you already have an established website audience, you can leverage it as a source of users for your chatbot.
Interaction with Existing Content – On your website, your nonprofit chatbot can pull information and content that already exists on your site. It can be served up to users in answer to their inquiries.
Interface Flexibility – Web-based chatbots often offer more features in the user interface, enabling functionality like polls, checkboxes, and other elements.

Drawbacks:

Computer or Smartphone Required – Users with older "feature phones" will not be able to access the chatbot.
No Re-Engagement – If users don't need to log in or share personal information, there is no way to contact them again once they have clicked away.

Bottom Line: An existing (and popular) website or landing page can be a good placement for a nonprofit chatbot when the priority is one-time interactions with wide accessibility and/or the ability to remain anonymous. However, limitations around re-engagement may make this less effective for some use cases.

SMS – Text Messaging Chatbots

SMS may currently be the most available communications platform in the world. This will undoubtedly help with accessibility, but it comes with some drawbacks.

Benefits:

Accessibility – Perhaps the most accessible platform available even to feature phone users!
No Internet Connectivity Needed – Since internet access is unnecessary, even phone users without data plans can access the nonprofit chatbot.
Repeated Engagement – It is easy to stay in contact with users, and it is not limited by the platform.
Proactive – You can initiate conversations with any user if you have their phone numbers.

Drawbacks:

Costs Increasing with Scale – when using a programmable SMS provider, such as Twilio, you’re more likely to pay per message. For large audiences, this can quickly get expensive.
Lack of International Flexibility – International organizations may face the requirement of maintaining several numbers and accounts for the same bot.
SMS Apathy – In some countries, abundant SMS spam has made people suspicious of and even annoyed by more text messages. Turn.io has shown that switching from SMS to WhatsApp can increase engagement by as much as 16x.

Bottom Line: SMS can be a good choice for a nonprofit chatbot in underdeveloped regions with limited internet connectivity, or when targeting an audience with limited smartphone adoption. It can also be a good solution for US-based projects where the audience has a direct connection to your organization, like college students.

WhatsApp for Nonprofit Chatbots

This cross-platform messaging app is hugely popular around the world and offers some unique features. However, it does have some drawbacks that make it less than perfect for some applications.

Benefits:

Largest User Base – WhatsApp was the second-most downloaded app in 2020. For nonprofits doing work in Africa, it is important to note that it is particularly popular there. There are also a variety of knock-off versions popular on the continent.
End-To-End Encryption – For situations where privacy is essential, like when dealing with health-related communication, this type of security is critical.
Universal Acceptance – Many users are already familiar with WhatsApp and may already have an account.

Drawbacks:

Severely Limited Follow-up Outreach – Follow-up messages are allowed but need to follow a template that is pre-approved by WhatsApp.
Lack of Interactive Capabilities – Even after all the time and effort for setup, the feature set is limited. Buttons were introduced into WhatsApp only recently, and quick replies, galleries, and other options are still missing.
Higher Price Tag – Whether you’re accessing WhatsApp through a service like Twilio that charges per message, or through a platform like Turn.io whose price tiers depend on monthly users, it’s generally a bit more expensive than other channels.
More Regulation – WhatsApp is more rigorous in supervising chatbot operators on the network. For example, it enforces a policy that requires the chatbot to have a clear escalation path.

Bottom Line: Due to its popularity and security, WhatsApp can be an excellent option for engaging large audiences but only for nonprofits with the time and resources to invest in their chatbot project.

Facebook Messenger Chatbots

Might the most popular social network in the world be the right home for your nonprofit chatbot? Facebook provides options loaded with features, but changing rules can complicate things.

Benefits:

Popularity – Facebook is the most popular social network in the world. Facebook's Messenger app, with more than 2 billion users, is currently the top downloaded app. In south-east Asia, Facebook is synonymous with the internet.
Interactivity – being the first platform to add chatbot capability, Facebook has continued to develop new features to make the service more interactive. Featuring the most advanced API, Messenger offers the ability to interact with users using quick replies, galleries, and other valuable features.
Connection to the Facebook Advertising mechanisms – When using Facebook Messenger for your nonprofit chatbot, you can take advantage of the integration with ads and comments. That means your bot can automatically reach out to a new user after they click on an ad or comment on a post.
Specialists and tools ecosystem – Since Facebook was the first to offer chatbot capability, it has developed an extensive network of specialists and tools to help launch and grow your nonprofit chatbot.

Drawbacks:

Changing Rules – Facebook's policies can change with little to no notice. In the past two years, Facebook has rolled out two significant policy changes, both of which had a drastic impact on chatbot capabilities. Those who depend on the service must stay up to date on the latest policy changes.
Ethical Considerations – Following recent investigations by the WSJ, users are abandoning the platform over ethical concerns, especially in developed countries.

Bottom Line: For nonprofits with an existing Facebook page and strategies, Facebook chatbots can be a great way to connect to existing content and take advantage of Facebook's well-developed social sharing infrastructure.

Telegram for Nonprofit Chatbots

Well known as a secure platform for communication, Telegram shows promise as a platform for a nonprofit chatbot. However, its lack of popularity in most countries may make adoption more difficult.

Benefits:

Interactivity – Like Facebook, Telegram offers some excellent interactive features like quick replies, buttons, and galleries. It also provides some more advanced features not found anywhere else.
Re-Engagement – Telegram has no limitations on proactive re-engagement.
Available in Groups – With Telegram, your chatbot can be a part of groups and work as a community helper.

Drawbacks:

Lack of Popularity – Since Telegram doesn't enjoy the same level of adoption as other options, there may be friction in getting users to install and register for an additional app.
Security – Although known for security, with Telegram, the chatbot conversation with the user is actually not end-to-end encrypted. This may make it less than optimal for healthcare or other uses where privacy is critical.

Bottom Line: This can be a great option as a nonprofit chatbot for organizations that already have users on the Telegram app, or organizations that are ready to invest in Telegram adoption for the sake of the project. Its lack of limitations on re-engagement can make it an excellent choice when that is a priority.

Other Channels

This is, of course, not a complete list. Depending on the country or countries where you may be launching your nonprofit chatbot, there may be other effective options. For example, Viber is quite popular in many countries, and some countries like Japan and Vietnam have apps that are dominant only in that country. So, in addition to spending time answering your own key questions, it is worth listening to your potential users to ensure that your platform of choice is a good fit.

Can I Have a Nonprofit Chatbot on More than One Channel?

Yes! Nothing is limiting your nonprofit to one chatbot on one channel. You can have multiple bots on multiple channels and even allow users to switch from one to another. Since the rules Facebook issued limit proactive re-engagement, some nonprofits have been combining Facebook Messenger with SMS. This way, they can re-engage users and, if helpful, send them back to the Facebook Messenger bot.

Liked this blog post? Want to know more about our chatbot services? Talk to us about your use case!

Training a Python to Explore Holes in Dark Patterns

2021-01-24T00:00:00-05:00

Data Science students are always looking for new and interesting datasets to train machine learning models. There's tons of public data out there. Unfortunately, in the US, many of our "public" datasets are difficult to access. The most interesting data is hidden behind dark patterns on corporate and government websites.

Here you'll see how to use Pandas to easily pull down a lot of data from prosocial websites like Wikipedia. Then you'll learn a little BeautifulSoup to scrape out that sneaky data that hides behind dark patterns.

`pandas.read_html`

If you build web pages with tables in them, they become accessible to anybody who knows how to use Pandas, like this Wikipedia page:

>>> import pandas as pd
>>> base_url = 'https://en.wikipedia.org'
>>> page_title = 'demographics of the world'
>>> page_url = f'{base_url}/wiki/{page_title.replace(" ", "_")}'
>>> tables = pd.read_html(page_url)
>>> len(tables)
25

Then you can easily find the interesting tables and calculate some statistics:

>>> for df in tables:
...     if len(df) > 10 and len(df.describe().columns) > 1:
...         print('='*70)
...         print(df.describe(include='all'))
...         print('='*70)
...         print()

Here's one of those tables of descriptive statistics:

================================================================================================================================================================
                 Year           0        1000        1500        1600        1700        1820        1870        1913        1950        1973        1998
count              17   17.000000   17.000000   17.000000   17.000000   17.000000   17.000000   17.000000   17.000000   17.000000   17.000000   17.000000
unique             17         NaN         NaN         NaN         NaN         NaN         NaN         NaN         NaN         NaN         NaN         NaN
top     United States         NaN         NaN         NaN         NaN         NaN         NaN         NaN         NaN         NaN         NaN         NaN
freq                1         NaN         NaN         NaN         NaN         NaN         NaN         NaN         NaN         NaN         NaN         NaN
mean              NaN   17.158824   16.752941   16.829412   16.858824   16.788235   17.064706   17.064706   17.064706   16.911765   16.905882   16.758824
std               NaN   28.360008   26.822684   26.260849   26.968131   26.524938   27.237197   25.845767   24.935666   24.582257   24.890397   25.281887
min               NaN    0.200000    0.200000    0.200000    0.100000    0.100000    0.100000    0.500000    0.800000    0.900000    1.000000    0.900000
25%               NaN    1.300000    2.400000    2.300000    1.100000    1.300000    1.400000    3.100000    4.400000    5.400000    5.400000    4.600000
50%               NaN    2.400000    4.200000    4.000000    3.700000    4.500000    5.300000    7.000000    7.000000    7.100000    7.900000    6.900000
75%               NaN   15.900000   15.400000   20.100000   20.000000   21.000000   20.100000   19.900000   17.000000   15.500000   17.300000   16.500000
max               NaN  100.000000  100.000000  100.000000  100.000000  100.000000  100.000000  100.000000  100.000000  100.000000  100.000000  100.000000
================================================================================================================================================================

Dark Patterns

What's a dark pattern? It's any UX that keeps people from getting things done unless they put up with manipulation, distraction, or lock-in. For a data scientist, a dark pattern is one that prevents them from accessing data.

When the Internet was new, and only teenagers and geeks knew how to use it, public officials could be forgiven for "publishing" data in PDFs or proprietary spreadsheets and databases. But we live in an era where elected officials responsible for securing voter registration data have the skill to deploy dark pattern websites that support their political agenda. And the skill to do this sort of sophisticated technical work is not limited to advanced, stable democracies like the United States. Officials in charge of data in most developing countries are also deploying sophisticated web applications that breach the public trust. Do a Duck search for "Brian Kemp suppression" if you want to learn more. He was so adept at managing his IT department, he successfully made voter registration data accessible only to his supporters and campaign managers. And using predictive analytics on this data, he was able to delete the voter registrations for those that would likely vote against him in his campaign for Governor.

Illuminating the Dark

So I'll show you how easy it is to process data from prosocial public data sources like Wikipedia. And then I'll show you the problem with some dark patterns on the web. Some are intentional and some are not, but we'll help you illuminate the data you want and scrape it.

There are no options for the pd.read_html function that will get you the text hidden behind a button. So when I tried to get a list of business names from the California Secretary of State website, I got everything except the name when Pandas automatically parsed the HTML:

>>> import pandas as pd
>>> bizname = 'poss'
>>> url = f'https://businesssearch.sos.ca.gov/CBS/SearchResults?filing=&SearchType=CORP&SearchCriteria={bizname}&SearchSubType=Begins'
>>> df = pd.read_html(url)[0]
>>> df

    Entity Number Registration Date         Status                                        Entity Name Jurisdiction      Agent for Service of Process
0      C2645412        04 / 02 / 2004         ACTIVE  View details for entity number 02645412  POSSU...      GEORGIA   ERESIDENTAGENT, INC. (C2702827)
1      C0786330        09 / 22 / 1976      DISSOLVED  View details for entity number 00786330  POSSU...   CALIFORNIA                        I. HALPERN
2      C2334141        03 / 01 / 2001  FTB SUSPENDED  View details for entity number 02334141  POSSU...   CALIFORNIA                   CLAIR G BURRILL
3      C0658630        11 / 08 / 1972  FTB SUSPENDED  View details for entity number 00658630  POSSU...   CALIFORNIA                               NaN
4      C1713121        09 / 23 / 1992  FTB SUSPENDED  View details for entity number 01713121  POSSU...   CALIFORNIA                LAWRENCE J. TURNER
5      C1207820        08 / 05 / 1983      DISSOLVED  View details for entity number 01207820  POSSU...   CALIFORNIA                          R L CARL
6      C3921531        06 / 27 / 2016         ACTIVE  View details for entity number 03921531  POSSU...   CALIFORNIA  REGISTERED AGENTS INC(C3365816)

The website hides business names behind a button. But you can use requests to download the raw HTML. Then you can use bs4 to extract the raw HTML table as well as any particular row (<tr>) or cell (<td>) that you want.

First let's see how public APIs and the semantic web are supposed to work. Say I read a great SciFi novel, Three Body Problem, and wanted to find other books that, like it, won the Hugo Award for best novel. This is how you search for something on Wikipedia:

>>> import requests
>>> base_url = 'https://en.wikipedia.org'
>>> search_text = 'hugo award best novel liu'
>>> search_results = requests.get(
...     'https://en.wikipedia.org/w/index.php',
...     {'search': search_text},
... )
>>> search_results

<Response [200]>

Now we can programmatically find the page with the Hugo Awards using BeautifulSoup4. Don't try to install BeautifulSoup without tacking that version 4 onto the end of the package name (beautifulsoup4). Otherwise you'll get some confusing error messages. And the import name is bs4, not beautifulsoup. The .find() method finds the first matching element in a BeautifulSoup object. So if you want to walk through the whole list of search results, use .find_all().

You only need the first search result for this carefully crafted search ;):

>>> import bs4
>>> soup = bs4.BeautifulSoup(search_results.text, 'html.parser')
>>> soup.find('div', {'class': 'searchresults'})
>>> soup = (soup.find('div', {'class': 'searchresults'}) or soup).find('ul')
>>> hugo_url = (soup.find('li') or soup).find('a', href=True).get('href')
>>> hugo_url

'/wiki/Hugo_Award_for_Best_Novel'

So now we can join the Wikipedia path with the base_url to get to the page containing the data table we're looking for. And we can use Pandas to download and parse it directly, without any fancy BeautifulSouping.
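Here is a minimal sketch of that join using only the standard library. The two strings are copied from the session above, and the variable name page_url is just an illustration. The resulting URL is the kind of thing you could then hand to pd.read_html.

```python
from urllib.parse import urljoin

base_url = 'https://en.wikipedia.org'
hugo_url = '/wiki/Hugo_Award_for_Best_Novel'  # the relative path found above

# urljoin handles the slash between the host and the path for you
page_url = urljoin(base_url, hugo_url)
print(page_url)
# https://en.wikipedia.org/wiki/Hugo_Award_for_Best_Novel
```

Using urljoin rather than string concatenation avoids doubled or missing slashes when the base URL or the path changes.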

Some of this code is on Stack Overflow in the answer to "Pandas read_html to return raw HTML".

>>> soup = bs4.BeautifulSoup(requests.get(url).text, 'html.parser')
>>> table = soup.find('table').find_all('tr')
>>> names = []
>>> for row in table:
...     # the header row has no <button>, so getattr falls back to ['']
...     names.append(getattr(row.find('button'), 'contents', [''])[0].strip())
...
>>> names[-7:]

['POSSUM FILMS, INC',
 'POSSUM INC.',
 'POSSUM MEDIA, INC.',
 'POSSUM POINT PRODUCTIONS, INC.',
 'POSSUM PRODUCTIONS, INC.',
 'POSSUM-BILITY EXPRESS, INCORPORATED',
 'POSSUMS WELCOME']

Now you can replace that useless column with the correct Button Text, the names of the businesses we're interested in. You need to ignore the first row in the HTML table, because it contains the header "Entity Name" and does not have a button tag:

>>> df['Entity Name'] = names[1:]
>>> df.tail()

    Entity Number Registration Date         Status                  Entity Name Jurisdiction Agent for Service of Process
96       C2334141        03/01/2001  FTB SUSPENDED           POSSUM MEDIA, INC.   CALIFORNIA              CLAIR G BURRILL
97       C0658630        11/08/1972  FTB SUSPENDED  POSSUM POINT PRODUCTIONS...   CALIFORNIA                          NaN
98       C1713121        09/23/1992  FTB SUSPENDED     POSSUM PRODUCTIONS, INC.   CALIFORNIA           LAWRENCE J. TURNER
99       C1207820        08/05/1983      DISSOLVED  POSSUM-BILITY EXPRESS, I...   CALIFORNIA                     R L CARL
100      C3921531        06/27/2016         ACTIVE              POSSUMS WELCOME   CALIFORNIA  REGISTERED AGENTS INC (C...
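If you want to try the button trick without hitting the live website, here is a rough offline sketch. It swaps bs4 for the standard library's html.parser, and the HTML snippet, entity numbers, and the ButtonTextParser class are all made up for illustration. They are not the real page or the code used above.

```python
from html.parser import HTMLParser

# Made-up stand-in for the search results page: names live inside <button> tags
SAMPLE_HTML = """
<table>
  <tr><th>Entity Number</th><th>Entity Name</th></tr>
  <tr><td>C0000001</td><td><button>POSSUM MEDIA, INC.</button></td></tr>
  <tr><td>C0000002</td><td><button> POSSUMS WELCOME </button></td></tr>
</table>
"""


class ButtonTextParser(HTMLParser):
    """Collect one string per <tr>: the text of its first <button>, or ''."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_button = False
        self._row_has_name = False

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            # start a new, empty name for every table row (header included)
            self.names.append('')
            self._row_has_name = False
        elif tag == 'button' and self.names and not self._row_has_name:
            self._in_button = True

    def handle_endtag(self, tag):
        if tag == 'button' and self._in_button:
            self._in_button = False
            self._row_has_name = True

    def handle_data(self, data):
        if self._in_button:
            self.names[-1] += data


parser = ButtonTextParser()
parser.feed(SAMPLE_HTML)
names = [name.strip() for name in parser.names]
print(names)
# ['', 'POSSUM MEDIA, INC.', 'POSSUMS WELCOME']
```

As with the bs4 version, the header row yields an empty string, so names[1:] lines up with the rows of the DataFrame.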

Resources

If you're working on an NLP problem, you can get data from Wikipedia the proper way... with a database dump: TDS Post on working with Wikipedia data dumps

Hacker Public Radio is awesome! I'm going to try to record my first podcast today, based on this blog post. I'll share these ideas for scraping public data out through the holes in dark patterns with Python (Pandas, Beautiful Soup). It'll be good practice for the monthly meetup of the San Diego Python User Group.

Accelerate Your Creativity with Automation

2020-09-11T00:00:00-04:00

If you want to have more brain cycles for fun, creative stuff, you can automate the boring stuff.
Though you can't really call it a "cognitive assistant", it will definitely help you think better.
Automating the boring stuff was the secret to the rise of powerhouse startups like GitHub, GitLab, Puppet, and even Google (in the early days).
Wildly efficient companies can generate millions in profit per employee.
Companies that don't get it find themselves struggling to exceed 100k of revenue per employee.

If you automate the drudgery, you will free up brain cycles for fun, creative stuff... like finding more things to automate.

At Tangible AI, our simplest, most popular automation has been the workon command. John, Olesya, and I use it a lot nearly every day as we switch back and forth between projects.
And we share it with our interns as part of the onboarding process.

Teaser: I've added a mind hack at the end of the post. It works well with the workon command to rev up your creativity.

`workon`

The virtualenvwrapper python package includes a command called workon.
And others have created a package called, appropriately, Workon.

But I use conda rather than virtualenv to organize my python environments.
So I wrote a hacky shell script to take care of this.

John and I are working on a python version of this.
But don't hold your breath...
If it ain't broke, we probably won't fix it.

Use case

Every time I open a new terminal, I find myself activating an environment and then switching to that directory.
Remembering the right environment name and directory path can be a problem.
I work on many different projects in a given day, and they change from day to day.

So I created a workon command that makes it easy for me to set up a project and come back to it later.
All workon does is find the paths to my conda environment and source code for a particular project.
It offloads my brain from memorizing paths and names and spellings that aren't helping me be creative.
Plus it gets me started quickly.

Now all I do is say workon qary and I'm off and running.
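To make the idea concrete, here is a rough Python sketch of the lookup a workon-style helper performs. It is not the Tangible AI bash script. The function names, the search directories, and the assumption that the conda environment shares the project's name are all hypothetical.

```python
import tempfile
from pathlib import Path


def find_project(name, search_dirs):
    """Return the first directory called `name` directly under `search_dirs`, else None."""
    for base in search_dirs:
        candidate = Path(base).expanduser() / name
        if candidate.is_dir():
            return candidate
    return None


def workon_commands(name, search_dirs, env_name=None):
    """Build the shell commands a workon-style helper would run for a project."""
    project_dir = find_project(name, search_dirs)
    if project_dir is None:
        raise FileNotFoundError(f'no project named {name!r} in {list(search_dirs)}')
    env = env_name or name  # assumption: the conda env is named after the project
    return [f'conda activate {env}', f'cd {project_dir}']


# Demo with a throwaway directory standing in for a real source folder
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / 'qary').mkdir()
    commands = workon_commands('qary', ['~/no-such-source-dir', tmp])

print(commands[0])
# conda activate qary
```

The sketch returns commands instead of running them because a Python process cannot change the working directory or the active environment of the shell that launched it. That is one reason a shell function is a natural fit for this job.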

Installation

Download the bash shell script.

wget -O bash_functions.sh "https://gitlab.com/tangibleai/qary/-/raw/master/scripts/bash_functions.sh?inline=false"

I put mine in my personal ~/bin/ directory where I keep all my automation scripts.
Then make sure that script is sourced as part of your bash login in .bashrc or .bash_profile:

mkdir -p ~/bin
mv bash_functions.sh ~/bin/
chmod +x ~/bin/bash_functions.sh
echo "source ~/bin/bash_functions.sh" >> ~/.bashrc

You may want to edit the bash_functions.sh script on line 38 to add paths where you keep your source code.
You might also want to add a git status command (or anything else you do a lot) below line 55.

That should do it!

Now, when you type workon qary it will get you all set up for some creative, productive coding on a chatbot to save the world!

Mind hack

Sometimes when you workon qary you end up staring at a blank screen or IDE.
It's hard to know where to start.
But if you do git status right after workon, it highlights any files you've modified but not committed.
Then, if you remember to leave an "easter egg" before you switch to another project or shutdown for the day, that'll pop up at the beginning of your next session.
Right before I switch to a new project I'll add a TODO comment to my code, or even start a new line of code within incomplete syntax as a note to self.
Make sure you don't commit this reminder until you're in the zone and coding up some new automation to save the world.

Smart authors like Grant Ingersol often use this trick to seed their brain with creative ideas and avoid writer's block.
A writer will finish the day with an unfinished sentence or the first line of a dialog.
It's like a note to your future self.
This can supercharge your creativity the next morning by reminding you where you left off.

So when I'm about to switch projects I'll add an unfinished line of code or TODO, but won't git commit it.
That way it shows up when I do git status.
If your past self forgot to do this, and git status is empty, you can just do a git log --stat to reorient yourself.
And if that still doesn't work, sometimes when you build your project, your linters will flag any broken lines of code from the previous session.

Linting

You lint don't you?!!!
If not, I'll set you straight in a follow-up post.
Linters are a crucial bit of automation that all the strong developers I know of lean on heavily.
Thank you Steven and Aleck for indoctrinating me with this habit all those years ago.
PEP8 linting is mandatory for all interns and contractors at Tangible AI, as well as contributors to qary.

Which flavor of BERT should you use for your QA task?

2020-07-21T00:00:00-04:00

Making an intelligent chatbot has never been easier, thanks to the abundance of open source natural language processing libraries, curated datasets and the power of transfer learning. Building basic question-answering functionality with the Transformers library can be as simple as this:

`Input 1: Load Pretrained Transformer QA Model`

from transformers import pipeline

# Context: a snippet from a Wikipedia article about Stan Lee
context = """
    Stan Lee[1] (born Stanley Martin Lieber /ˈliːbər/; December 28, 1922 - November 12, 2018) was an American comic book 
    writer, editor, publisher, and producer. He rose through the ranks of a family-run business to become Marvel Comics' 
    primary creative leader for two decades, leading its expansion from a small division of a publishing house to
    multimedia corporation that dominated the comics industry.
    """

nlp = pipeline('question-answering')
result = nlp(context=context, question="Who is Stan Lee?")
print(result)

`Output 1: Report for Default Transformer QA Model`

{'score': 0.2854291316652837,
 'start': 95,
 'end': 159,
 'answer': 'an American comic book writer, editor, publisher, and producer.'}

BOOM! It works! That low confidence score is a little worrisome, though. You’ll see how that comes into play later, when we talk about BERT’s ability to detect impossible questions and irrelevant contexts.

Taking some time to choose the right model for your task will ensure that you get the best possible out-of-the-box performance from your conversational agent. Your choice of both a language model and a benchmarking dataset will make or break the performance of your chatbot. BERT (Bidirectional Encoder Representations from Transformers) models perform very well on complex information extraction tasks. They capture not only the meaning of words, but also their context. Before choosing a model (or settling for the default option) you probably want to evaluate your candidate model for accuracy and resources (RAM and CPU cycles) to make sure that it actually meets your expectations.

In this article you will see how we benchmarked our QA model using the Stanford Question Answering Dataset (SQuAD). There are many other good question-answering datasets you might want to use, including Microsoft’s NewsQA, CommonsenseQA, ComplexWebQA, and many others. To maximize accuracy for your application you’ll want to choose a benchmarking dataset representative of the questions, answers, and contexts you expect in your application.

The Huggingface Transformers library has a large catalogue of pretrained models for a variety of tasks: sentiment analysis, text summarization, paraphrasing, and, of course, question answering. We chose a few candidate question-answering models from the repository of available models. Lo and behold, many of them have already been fine-tuned on the SQuAD dataset. Awesome! Here are a few SQuAD fine-tuned models we are going to evaluate:

distilbert-base-cased-distilled-squad
bert-large-uncased-whole-word-masking-finetuned-squad
ktrapeznikov/albert-xlarge-v2-squad-v2
mrm8488/bert-tiny-5-finetuned-squadv2
twmkn9/albert-base-v2-squad2

We ran predictions with our selected models on both versions of SQuAD (version 1 and version 2). The difference between them is that SQuAD-v1 contains only answerable questions, while SQuAD-v2 contains unanswerable questions as well. To illustrate this, let us look at the below example from the SQuAD-v2 dataset. An answer to Question 2 is impossible to derive from the given context from Wikipedia:

Question 1: “In what country is Normandy located?”
Question 2: “Who gave their name to Normandy in the 1000’s and 1100’s”
Context: “The Normans (Norman: Nourmands; French: Normands; Latin: Normanni) were the people who in the 10th and 11th centuries gave their name to Normandy, a region in France. They were descended from Norse (“Norman” comes from “Norseman”) raiders and pirates from Denmark, Iceland and Norway who, under their leader Rollo, agreed to swear fealty to King Charles III of West Francia. Through generations of assimilation and mixing with the native Frankish and Roman-Gaulish populations, their descendants would gradually merge with the Carolingian-based cultures of West Francia. The distinct cultural and ethnic identity of the Normans emerged initially in the first half of the 10th century, and it continued to evolve over the succeeding centuries.”
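To see how that difference shows up in the data, here is a small sketch of one paragraph record laid out the way I understand the SQuAD-v2 JSON to be organized: each question carries an is_impossible flag and a list of answers with character offsets. The ids 'q1' and 'q2' and the helper function are made up for illustration, and the context is shortened to its first sentence.

```python
context = (
    'The Normans (Norman: Nourmands; French: Normands; Latin: Normanni) were '
    'the people who in the 10th and 11th centuries gave their name to '
    'Normandy, a region in France.'
)

paragraph = {
    'context': context,
    'qas': [
        {
            'id': 'q1',  # hypothetical id, real SQuAD ids look different
            'question': 'In what country is Normandy located?',
            'is_impossible': False,
            'answers': [
                {'text': 'France', 'answer_start': context.index('France')},
            ],
        },
        {
            'id': 'q2',  # hypothetical id
            'question': "Who gave their name to Normandy in the 1000's and 1100's",
            'is_impossible': True,
            'answers': [],  # nothing in the context answers this question
        },
    ],
}


def split_by_answerability(paragraph):
    """Return (answerable_ids, impossible_ids) for one paragraph record."""
    answerable, impossible = [], []
    for qa in paragraph['qas']:
        target = impossible if qa.get('is_impossible', False) else answerable
        target.append(qa['id'])
    return answerable, impossible


answerable, impossible = split_by_answerability(paragraph)
print(answerable, impossible)
# ['q1'] ['q2']
```

Using qa.get('is_impossible', False) lets the same helper read SQuAD-v1 style records, where every question is answerable and the flag is absent.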

Our ideal model should be able to understand that context well enough to compose an answer. Let us get started! To define a model and a tokenizer in Transformers, we can use the Auto classes. In most cases they can derive the settings automatically from the model name. We need only a few lines of code to set it up:

`Input 2: Load Large Uncased BERT Transformer Pretuned for SQuAD`

from tqdm import tqdm
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

modelname = 'bert-large-uncased-whole-word-masking-finetuned-squad'

tokenizer = AutoTokenizer.from_pretrained(modelname)
model = AutoModelForQuestionAnswering.from_pretrained(modelname)

We will use human-level performance as our target for accuracy. The SQuAD leaderboard provides the human-level performance for this task, which is 87% accuracy of finding the exact answer and an 89% f1 score.

You might ask, “How do they know what human performance is?” and “What humans are they talking about?” Those Stanford researchers are clever. They just used the same crowd-sourced humans that labeled the SQuAD dataset. For each question in the test set they had multiple humans provide alternative answers. For the human score they just left one of those answers out and checked to see if it matched any of the others using the same text comparison algorithm that they used to evaluate the machine model. The average accuracy for this “leave one human out” dataset is what determined the human-level score that the machines are shooting for.

To run predictions on our datasets, first we have to transform the downloaded SQuAD files into computer-interpretable features. Luckily, the Transformers library already has a handy set of functions to do exactly that:

`Input 3: Load and Preprocess SQuAD v2`

from transformers import squad_convert_examples_to_features
from transformers.data.processors.squad import SquadV2Processor

# `path` is the directory that holds the downloaded SQuAD v2 dev set (dev-v2.0.json)
processor = SquadV2Processor()
examples = processor.get_dev_examples(path)
features, dataset = squad_convert_examples_to_features(
    examples=examples,
    tokenizer=tokenizer,
    max_seq_length=512,
    doc_stride=128,
    max_query_length=256,
    is_training=False,
    return_dataset='pt',
    threads=4,  # number of CPU cores to use
)

We will use PyTorch and its GPU capability (optional) to make predictions:

`Input 4: Prediction (Inference) with a Transformer`

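import torch
from torch.utils.data import DataLoader, SequentialSampler
from transformers.data.processors.squad import SquadResult

# Use a GPU if one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)
model.eval()  # inference mode: disables dropout

eval_sampler = SequentialSampler(dataset)
eval_dataloader = DataLoader(dataset, sampler=eval_sampler, batch_size=10)

all_results = []

def to_list(tensor):
    return tensor.detach().cpu().tolist()

for batch in tqdm(eval_dataloader):
    batch = tuple(t.to(device) for t in batch)

    with torch.no_grad():
        inputs = {
            "input_ids": batch[0],
            "attention_mask": batch[1],
            "token_type_ids": batch[2], # NOTE: skip token_type_ids for distilbert
        }

        example_indices = batch[3]

        outputs = model(**inputs)  # this is where the magic happens

        for i, example_index in enumerate(example_indices):
            eval_feature = features[example_index.item()]
            unique_id = int(eval_feature.unique_id)
            # Assumed completion (the original listing stops at the line above):
            # keep the start and end logits of each feature for scoring later
            start_logits = to_list(outputs[0][i])
            end_logits = to_list(outputs[1][i])
            all_results.append(SquadResult(unique_id, start_logits, end_logits))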
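The NOTE in the inputs dictionary above can be turned into a small helper so the same loop works for every candidate model. This is only a sketch: build_inputs is a hypothetical name, and checking for 'distilbert' in the model name is a simple heuristic rather than anything the Transformers library prescribes. Plain lists stand in for tensors so the snippet runs without PyTorch.

```python
def build_inputs(batch, modelname):
    """Assemble model inputs, leaving out token_type_ids for DistilBERT models."""
    inputs = {
        'input_ids': batch[0],
        'attention_mask': batch[1],
    }
    # Heuristic assumption: DistilBERT checkpoints have 'distilbert' in their name
    if 'distilbert' not in modelname.lower():
        inputs['token_type_ids'] = batch[2]
    return inputs


# Plain lists stand in for the tensors in a real batch
fake_batch = ([[101, 2040, 102]], [[1, 1, 1]], [[0, 0, 0]])

print(sorted(build_inputs(fake_batch, 'distilbert-base-cased-distilled-squad')))
# ['attention_mask', 'input_ids']
print(sorted(build_inputs(fake_batch, 'twmkn9/albert-base-v2-squad2')))
# ['attention_mask', 'input_ids', 'token_type_ids']
```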

Importantly, the model inputs should be adjusted for a DistilBERT model (such as distilbert-base-cased-distilled-squad). DistilBERT’s implementation differs from BERT and ALBERT, so we should exclude the “token_type_ids” field to keep the script from erroring out. Everything else stays exactly the same.

Finally, to evaluate the results, we can apply the squad_evaluate() function from the Transformers library:

`Input 5: transformers.squad_evaluate()`

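from transformers.data.metrics.squad_metrics import squad_evaluate

# predictions: dict of question id -> predicted answer text
# null_odds: dict of question id -> no-answer score
results = squad_evaluate(examples,
                         predictions,
                         no_answer_probs=null_odds)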

Here is an example report generated by squad_evaluate:

`Output 5: Accuracy Report from transformers.squad_evaluate()`

OrderedDict([('exact', 65.69527499368314),
             ('f1', 67.12954950681876),
             ('total', 11873),
             ('HasAns_exact', 62.48313090418353),
             ('HasAns_f1', 65.35579306586668),
             ('HasAns_total', 5928),
             ('NoAns_exact', 68.8982338099243),
             ('NoAns_f1', 68.8982338099243),
             ('NoAns_total', 5945),
             ('best_exact', 65.83003453213173),
             ('best_exact_thresh', -21.529870867729187),
             ('best_f1', 67.12954950681889),
             ('best_f1_thresh', -21.030719757080078)])
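To get a feel for what the exact and f1 entries measure, here is a rough, self-contained sketch of SQuAD-style answer comparison, plus one possible reading of the “leave one human out” idea. It is not the code inside squad_evaluate. The function names and the three sample answers are made up, and the scores come out as fractions rather than the percentages shown in the report.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and the articles a/an/the, collapse whitespace."""
    text = text.lower()
    text = ''.join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r'\b(a|an|the)\b', ' ', text)
    return ' '.join(text.split())


def exact_match(prediction, truth):
    """1 if the normalized strings are identical, else 0."""
    return int(normalize(prediction) == normalize(truth))


def f1_score(prediction, truth):
    """Token-level F1 between a predicted answer and a reference answer."""
    pred_tokens = normalize(prediction).split()
    truth_tokens = normalize(truth).split()
    if not pred_tokens or not truth_tokens:
        # an empty answer only matches another empty answer
        return float(pred_tokens == truth_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


def leave_one_out(answers, metric):
    """Score each human answer against the remaining ones and average the best matches."""
    scores = []
    for i, held_out in enumerate(answers):
        others = answers[:i] + answers[i + 1:]
        scores.append(max(metric(held_out, other) for other in others))
    return sum(scores) / len(scores)


human_answers = ['France', 'France', 'in France']  # made-up annotator answers
human_exact = leave_one_out(human_answers, exact_match)
human_f1 = leave_one_out(human_answers, f1_score)
print(round(human_exact, 3), round(human_f1, 3))
```

The third annotator's answer shares a token with the others but is not identical, so it counts as a miss for exact match and as a partial hit for f1. That is why f1 is usually the higher of the two numbers.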

Now let us compare exact answer accuracy scores (“exact”) and f1 scores for the predictions generated for our two benchmarking datasets, SQuAD-v1 and SQuAD-v2. All models perform substantially better on the dataset without negatives (SQuAD-v1), but we do have a clear winner (ktrapeznikov/albert-xlarge-v2-squad-v2). Overall, it performs better on both datasets. More good news: our generated report for this model exactly matches the report posted by the author. The accuracy and f1 fall just a little short of human-level performance, but this is still a great result for a challenging dataset like SQuAD.

Table 1: Accuracy Scores for Each of 5 Models on SQuAD v1 & v2

We are going to compare the full reports for SQuAD-v2 predictions in the next table. It looks like ktrapeznikov/albert-xlarge-v2-squad-v2 did almost equally well on both tasks: (1) identifying the correct answers to the answerable questions, and (2) weeding out the unanswerable questions. Interestingly though, bert-large-uncased-whole-word-masking-finetuned-squad offers a significant (approximately 5%) boost to the prediction accuracy on the first task (answerable questions), but completely fails on the second task.

Table 2: Separate Accuracy Scores for Impossible Questions

We can optimize the model to perform better on identifying unanswerable questions by adjusting the null threshold for the best f1 score. Remember, the best f1 threshold is one of the outputs computed by the squad_evaluate function (best_f1_thresh). Here is how the prediction metrics for SQuAD-v2 change when we apply the best_f1_thresh from the SQuAD-v2 report:

`Input 6: Accuracy Report from transformers.squad_evaluate()`

best_f1_thresh = results['best_f1_thresh']  # from the Output 5 report

report = dict(squad_evaluate(
    examples,
    predictions,
    no_answer_probs=null_odds,
    no_answer_probability_threshold=best_f1_thresh))
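The effect of that threshold can be sketched in a few lines. This is my understanding of the idea, not the library's code: any question whose no-answer score is above the threshold is treated as “no answer”, which is right only when the question really is unanswerable. The function name, question ids, and numbers below are all made up.

```python
def apply_no_answer_threshold(scores, no_answer_scores, has_answer, threshold):
    """Override a question's score when its no-answer score exceeds the threshold."""
    adjusted = {}
    for qid, score in scores.items():
        if no_answer_scores[qid] > threshold:
            # Treated as "no answer": correct only if the question truly has none
            adjusted[qid] = float(not has_answer[qid])
        else:
            adjusted[qid] = score
    return adjusted


# Made-up example: q2 is unanswerable, but the model returned some text for it
scores = {'q1': 1.0, 'q2': 0.0, 'q3': 1.0}
no_answer_scores = {'q1': -5.0, 'q2': 3.0, 'q3': 0.5}
has_answer = {'q1': True, 'q2': False, 'q3': True}

loose = apply_no_answer_threshold(scores, no_answer_scores, has_answer, threshold=1.0)
strict = apply_no_answer_threshold(scores, no_answer_scores, has_answer, threshold=0.0)
print(loose)
# {'q1': 1.0, 'q2': 1.0, 'q3': 1.0}
print(strict)
# {'q1': 1.0, 'q2': 1.0, 'q3': 0.0}
```

Lowering the threshold from 1.0 to 0.0 still catches the unanswerable q2, but it also throws away the correct answer to q3.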

Table 3: Adjusted Accuracy Scores

While this adjustment helps the model more accurately identify the unanswerable questions, it does so at the expense of accuracy on the answerable questions. This trade-off should be carefully considered in the context of your application.

Let’s use the Transformers QA pipeline to test drive the models with a few questions of our own. We picked the following passage from a Wikipedia article on computational linguistics:

`Input 7: Computational Linguistics Questions (an unseen test example)`

context = '''
Computational linguistics is often grouped within the field of artificial intelligence 
but was present before the development of artificial intelligence.
Computational linguistics originated with efforts in the United States in the 1950s to use computers to automatically translate texts from foreign languages, particularly Russian scientific journals, into English.[3] Since computers can make arithmetic (systematic) calculations much faster and more accurately than humans, it was thought to be only a short matter of time before they could also begin to process language.[4] Computational and quantitative methods are also used historically in the attempted reconstruction of earlier forms of modern languages and sub-grouping modern languages into language families.
Earlier methods, such as lexicostatistics and glottochronology, have been proven to be premature and inaccurate. 
However, recent interdisciplinary studies that borrow concepts from biological studies, especially gene mapping, have proved to produce more sophisticated analytical tools and more reliable results.[5]
'''
questions=['When was computational linguistics invented?',
          'Which problems computational linguistics is trying to solve?',
          'Which methods existed before the emergence of computational linguistics ?',
          'Who invented computational linguistics?',
          'Who invented gene mapping?']

Note that the last two questions are impossible to answer from the given context. Here is what we got from each model we tested:

`Output 7: Computational Linguistics Questions (last two are impossible questions)`

Model: bert-large-uncased-whole-word-masking-finetuned-squad
-----------------
Question: When was computational linguistics invented?
Answer: 1950s (confidence score 0.7105585285134239)

Question: Which problems computational linguistics is trying to solve?
Answer: earlier forms of modern languages and sub-grouping modern languages into language families. (confidence score 0.034796690637104444)

Question: What methods existed before the emergence of computational linguistics?
Answer: lexicostatistics and glottochronology, (confidence score 0.8949566496998465)

Question: Who invented computational linguistics?
Answer: United States (confidence score 0.5333964470000865)

Question: Who invented gene mapping?
Answer: biological studies, (confidence score 0.02638426599066701)

Model: ktrapeznikov/albert-xlarge-v2-squad-v2
-----------------
Question: When was computational linguistics invented?
Answer: 1950s (confidence score 0.6412413898187204)

Question: Which problems computational linguistics is trying to solve?
Answer: translate texts from foreign languages, (confidence score 0.1307672173261354)

Question: What methods existed before the emergence of computational linguistics?
Answer:  (confidence score 0.6308010582306451)

Question: Who invented computational linguistics?
Answer:  (confidence score 0.9748902345310917)

Question: Who invented gene mapping?
Answer:  (confidence score 0.9988990117797236)

Model: mrm8488/bert-tiny-5-finetuned-squadv2
-----------------
Question: When was computational linguistics invented?
Answer: 1950s (confidence score 0.5100432430158293)

Question: Which problems computational linguistics is trying to solve?
Answer: artificial intelligence. (confidence score 0.03275686739784334)

Question: What methods existed before the emergence of computational linguistics?
Answer:  (confidence score 0.06689302592967117)

Question: Who invented computational linguistics?
Answer:  (confidence score 0.05630986208743849)

Question: Who invented gene mapping?
Answer:  (confidence score 0.8440988190788303)

Model: twmkn9/albert-base-v2-squad2
-----------------
Question: When was computational linguistics invented?
Answer: 1950s (confidence score 0.630521506320747)

Question: Which problems computational linguistics is trying to solve?
Answer:  (confidence score 0.5901262729978356)

Question: What methods existed before the emergence of computational linguistics?
Answer:  (confidence score 0.2787252009804586)

Question: Who invented computational linguistics?
Answer:  (confidence score 0.9395531361082305)

Question: Who invented gene mapping?
Answer:  (confidence score 0.9998772777192002)

Model: distilbert-base-cased-distilled-squad
-----------------
Question: When was computational linguistics invented?
Answer: 1950s (confidence score 0.7759537003546768)

Question: Which problems computational linguistics is trying to solve?
Answer: gene mapping, (confidence score 0.4235580072416312)

Question: What methods existed before the emergence of computational linguistics?
Answer: lexicostatistics and glottochronology, (confidence score 0.8573431178602817)

Question: Who invented computational linguistics?
Answer: computers (confidence score 0.7313878935375229)

Question: Who invented gene mapping?
Answer: biological studies, (confidence score 0.4788379586462099)

As you can see, it is hard to evaluate a model based on a single data point, since the results are all over the map. While each model gave the correct answer to the first question (“When was computational linguistics invented?”), the other questions proved to be more difficult. This means that even our best model probably should be fine-tuned again on a custom dataset to improve further.

Take away:

Open source pretrained (and fine-tuned!) models can kickstart your natural language processing project.
Before anything else, try to reproduce the original results reported by the author, if available.
Benchmark your models for accuracy. Even models fine-tuned on the exact same dataset can perform very differently.

Leave a comment or send me an e-mail if I can help you get started with pretrained transformers!

About The Author

Olesya Bondarenko is Lead Developer at Tangible AI where she leads the effort to make QAry smarter. QAry is an open source question answering system you can trust with your most private data and questions.

Getting Started with Python and Data Analysis

2020-07-15T00:00:00-04:00

We not only help international organizations to start using machine intelligence, we also help new developers and Data Scientists get their start.
Here are our favorite tutorials and resources to help you get started if you're new to Python or data science.

If you like working on some of these exercises and want to find some ways to use Python for social impact, consider applying for our quarterly internship. Each quarter we take on two unpaid interns that we guide on their journey from zero to data science hero in only two and a half months.
We even have a chatbot to help you along the way with daily encouragement, guidance and exercises.

Programming

Here you should find everything you need to get started in programming.
And there will be ways for programmers of all skill levels to develop their craft.
Even I use the resources in this list to "level up" my skills now and then.

Complete Beginner

If you have no programming experience at all, this is a great place to start.
Python was designed to be the most readable and learnable programming language.
Plus you'll be learning the most powerful and popular programming language.

Open Source Society University: Everything for complete free college degree
CSDojo YouTube: Complete Beginner Intro to Python RECOMMENDED
PythonSpot: Simple text game
Introduction to Computer Science and Programming using Python RECOMMENDED by David Valdez

Datasets & Project Ideas

Here are some datasets, blog posts, and tutorials that might spark some ideas for projects of your own.
One of the most reliable ways to learn something is to duplicate someone else's work and then play with it or break it.
Eventually you may be able to modify, extend, or generalize it in some way that extends the state of the art.

NLP

How nonprofits are using chatbots to scale their impact

2019-05-20T00:00:00-04:00

Artificial intelligence-powered chatbots have become ubiquitous in today’s world. Increasingly, more non-profit organizations are creating chatbots for different purposes. AI-powered chatbots help fundraising efforts, help assist with education, and provide an artificial support staff that is available around the clock. One of the biggest benefits that non-profits see is that chatbots allow them to scale their impact, serve more people than ever, all without increasing business costs.

In this article, we show you 5 examples of real-world use cases for chatbots that have already been tried and implemented in the nonprofit sector. As you will see, chatbots and artificial intelligence are not a distant future - they're here, and they are here to stay.

Automating Large-Scale Campaigns

Large-scale campaigns typically require a large support base, which means putting a lot of feet on the ground. It can be impossible to support the target audience when this many resources are required. Chatbots can automate that support role, providing a low-cost, virtual, and always available experience that is both engaging and interactive.

Recently, the Fight for Future Foundation launched a chatbot to help with their voter registration campaign. Their goal was to increase voter registration and encourage citizens to visit their local polling office and vote. Launching this type of campaign meant a large scale, pro-active outreach, though.

Fight for Future Foundation settled on virtualizing their outreach programs through chatbots. It is estimated that more than 150,000 unique people interacted with their chatbot which resulted in more than 50,000 new registrations. Employing a live workforce to handle numbers at this scale would have been infeasible for almost any organization.

Personalize Support Without Increased Staff

Non-profits require a more personalized support mechanism than any other business segment. NGOs must connect with their audience on a personal level. Yet, scaling to provide that kind of personal interaction can be extremely difficult and expensive. By providing small, digestible pieces of information, and easy-to-use user experience, chatbots can help a targeted audience find exactly the information they need quickly.

Take rAInbow, for example. This chatbot is part of a campaign that helps create awareness about domestic violence. That can be a very difficult conversation to have with people, much less convince victims to come forward! rAInbow helps people learn more about the cause in a personal way and helps victims come forward.

Connecting to Younger Audiences

Younger generations have fully embraced the web, including all the technologies it has to offer. As a result, millennials (now in the 25-35 year age group) use chatbots more than most. 47% of millennials have reported that they interact with a chatbot regularly. Considering this age group is now the target audience for most organizations, it would be wise to interact with them in a familiar way.

Consider Roo as an example. Roo is a chatbot created by Planned Parenthood that helps young people find answers to sexual and parenting questions. These can be rather personal conversations. Yet Roo can help this target audience 24/7 and has had so much success that it answered more than 800,000 questions in the first 6 months after being launched. Roo has also helped people find new primary practitioners that offer better services and has scheduled more than 1,000 appointments, too!

Increasing Awareness and Advocacy for Organization's Cause

Advocacy is one of the most important parts of any mission for an NGO. Advocacy is what explains the mission of a non-profit and convinces people to stand behind it. It creates legitimacy. In many cases, though, it can be hard to educate the target audience about crucial and complex subjects. Chatbots can help people learn about any given subject and create a sort of empathy that could not be translated through human interaction. These experiences can be broken down into digestible stories that engage the audience and offer unique resources for understanding.

Mencap’s chatbot, Irene, is a great example of this. Irene acts as a guide to understanding learning disabilities. It offers common responses that users can select, in a format that almost resembles a game. Yet the language Irene uses is both engaging and informative.

Building an Engaged Community

Engagement is one of the hardest challenges facing any organization, much less non-profits. It can be difficult to get people to listen to how you can help them and get a response back. Farm.ink understood this problem. They built a chatbot that supports a community of more than 50,000 farmers in Sub-Saharan Africa. Farm.ink's chatbot, connected to the Africa Farmers Club group, helps farmers find critical information inside the group, hear the latest news, and learn new things.

Other platforms, like Resistbot, are similar but serve a different audience. Resistbot is a real-time notification system for political activism. It will tell people about events in their vicinity and help protesters organize.

Eager to build a chatbot for your own organization? You don’t need an AI expert with a computer science degree to build one for your non-profit. It’s easier than ever today! Tools like ManyChat and Chatfuel let you make a messenger bot in minutes. If you need a more complex tool, one that can listen and respond to natural language, services like Google’s Dialogflow are free and easy to start with.

Do you feel like you need a more complex solution? Contact us and we will be happy to help build out a successful chatbot for your NGO.
