AI

Table of Contents

Introduction

🔝

  • Simulations of human intelligence through computer systems.
  • Utilizes algorithms and data to function.
  • Enables machines to perform tasks requiring human intelligence.

Basics

🔝

History of AI

  • 1950s:
    • Alan Turing’s Test
    • McCarthy coined “Artificial Intelligence”
  • 1980s:
    • Machine Learning boom
  • 1990s:
    • Neural Networks
  • 2000s:
    • Deep Learning’s rise
  • 2010s:
    • NLP and Computer Vision
  • 2020s:
    • Deep Learning Models, Autonomous Systems and Healthcare Applications.

How does AI Learn

  • Supervised Learning: The algorithm is trained on human-labeled data; the more samples, the better the model.
    • Types of supervised learning
      • Regression: Models the relationship between input features (x) and a continuous output variable (y). Regression is used to estimate or predict continuous values.
      • Classification: Assigns discrete class labels (y) to input features (x). Classification focuses on identifying which category or class an input belongs to.
      • Neural Network: Structures that imitate the human brain to process input data, recognize patterns, and make decisions or predictions. Neural networks can be used for both regression and classification tasks.
  • Unsupervised Learning: Relies on giving the algorithm unlabeled data and it finds the patterns itself, useful for clustering similar data points and detecting anomalies.
  • Reinforcement Learning: The algorithm is given a set of rules, goals, allowed actions, and constraints. It learns by trying different actions, receiving rewards for good decisions and penalties for bad ones, aiming to maximize its total reward. Used for tasks like teaching a machine to play chess or navigate an obstacle course.
  • Training a model involves splitting a dataset into training, validation, and testing sets (see the sketch below).
    • Training set: Trains the algorithm
    • Validation set: Fine-tunes and validates the model
    • Test set: Evaluates the model’s performance
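  • A minimal sketch of the supervised workflow and data split described above, assuming scikit-learn is installed (the dataset, model, and split ratios are only illustrative):
      from sklearn.datasets import load_iris
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import train_test_split

      # Load a small human-labeled dataset: features X, class labels y
      X, y = load_iris(return_X_y=True)

      # Split into training (60%), validation (20%), and test (20%) sets
      X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
      X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

      # Training set trains the algorithm
      model = LogisticRegression(max_iter=200).fit(X_train, y_train)

      # Validation set is used to tune the model; test set gives the final evaluation
      print("validation accuracy:", model.score(X_val, y_val))
      print("test accuracy:", model.score(X_test, y_test))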

Types of AI

  • Strength:
    1. Narrow/Weak AI (narrow AI)
      • AI tailored for specific domains
      • Decision-making based on programmed algorithms and training data
      • Ex: Translator, Recommendation System
    2. General/Strong AI (generalized AI)
      • An AI with diverse capabilities across unrelated tasks
      • Acquires new skills to face new challenges
      • Is a combination of various AI strategies
      • Ex: Finance, HR, R&D, Supply Chain
    3. Super/Conscious AI (conscious AI)
      • AI with human level consciousness
      • Requires self-awareness
      • Ex: Healthcare, autonomous vehicles, robotics, Natural Language Understanding
  • Breadth:
  • Applications:

Fundamental Approaches to AI

  • Discriminative AI: Approach that learns to distinguish between different classes of data, best for classification task and cannot understand context or generate new content.
  • Generative AI: Generates new content based on training data, i……

Augmented Intelligence

  • Combines human strengths with machine intelligence.
  • Uses AI to help us see and understand the world in new ways.
  • Enables humans to accomplish things impossible alone.

Branches of AI

🔝

  • Cognitive Computing
    • Focuses on creating systems that simulate/mimic human thought processes
    • Involves intellectual activities like
      • Thinking
      • Reasoning
      • Problem-Solving
    • Core Elements of Cognitive Computing
      • Perception: Involves gathering and interpreting data from various sources to understand the environment.
      • Learning: Utilizes machine learning algorithms to analyze data and extract meaningful insights, improving over time.
      • Reasoning: Making accurate predictions and decisions based on data analytics.
  • Machine Learning
    • Subset of AI that analyzes data using computer algorithms and makes intelligent decisions based on its learning without being explicitly programmed.
    • Machine learning models:
      • Are trained with large data sets
      • Learn from examples, not rules
      • Enable machines to solve problems independently
      • Make accurate predictions using the given data
  • Deep Learning
    • Specialized subset of machine learning that uses multi-layered neural networks, known as deep neural networks, to analyze complex data and simulate human decision-making.
    • It allows continuous improvement and learning.
    • Enhances AI’s natural language understanding by grasping context and intent.
    • Neural Networks: Computational models inspired by the brain’s neural structure, composed of interconnected layers, mainly three:
      • Input Layer: Receives data
      • Hidden Layers: Processes data through multiple layers
      • Output Layer: Produces results
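    • A toy forward pass through these three layers, sketched with numpy only (the weights, shapes, and activation choices below are illustrative assumptions, not tied to any specific library or model):
        import numpy as np

        rng = np.random.default_rng(0)
        x = rng.normal(size=(1, 4))                    # input layer: one sample with 4 features

        W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)  # weights into the hidden layer
        W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)  # weights into the output layer

        hidden = np.maximum(0, x @ W1 + b1)            # hidden layer processes the data (ReLU)
        logits = hidden @ W2 + b2                      # output layer produces raw scores
        probs = np.exp(logits) / np.exp(logits).sum()  # softmax turns scores into probabilities
        print(probs)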
  • Generative AI 💉
    • Generative AI refers to a class of artificial intelligence technologies that enable machines to autonomously create new content.
  • Natural Language Processing
    • Subset of AI that enables computers to comprehend, interpret, and produce natural language.
    • NLP translates unstructured text into structured data and it involves two main components:
      • Natural Language Understanding (NLU) for converting unstructured to structured data.
      • Natural Language Generation (NLG) for the reverse.
    • Subcategories:
      • Speech-to-Text
      • Text-to-Speech
  • Computer Vision
    • Field of AI that interprets and comprehends visual data, analyzing image or video data to derive conclusions.
    • Applications:
      • Image Classification
      • Object Detection
      • Image Segmentation Techniques

AI Agents

  • AI agents are software programs that autonomously interact with their environment, process data, and perform tasks to achieve human-defined goals.

Robotics

  • Involves designing, constructing and operating robots
  • Made up of components:
    • Sensors: Act as the robot’s eyes and ears; gather information and detect obstacles
    • Actuators: Act as the robot’s muscles; enable movement and interaction (motors, hydraulics, etc.)
    • Controllers: Act as the robot’s brain; run the software that controls the robot’s parts and interprets sensor data
  • Cobots: Robots that collaborate with humans, using advanced sensors and AI technologies to communicate and coordinate actions that require teamwork.

Prompts

💉

  • A prompt is any input or series of instructions used to produce a desired output; prompts help direct the creativity of a generative model.
  • Building blocks of well structured prompt include:
    • Instruction
    • Context
    • Input Data
    • Output Indicators
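  • A hypothetical prompt combining all four building blocks (the review text and dashboard scenario are made up for illustration):
      Instruction: Summarize the customer review below in one sentence.
      Context: The summary will appear on an internal support dashboard read by non-technical staff.
      Input Data: "The app is great, but it crashes every time I open the settings page."
      Output Indicator: Return only the one-sentence summary, with no extra commentary.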

Brief

🔝

Generative AI

  • Generative AI refers to a class of artificial intelligence technologies that enable machines to autonomously create new content.
  • Sub-models
    • Language Models
      • GPT Models: Used for text generation (e.g., ChatGPT).
      • PaLM Models: Focused on text-in, text-out workflows.
      • Gemini Models: Multimodal capabilities, including text and image processing.
    • Image Generation Models
      • Stable Diffusion: Advanced text-to-image technology.
      • DALL-E Model: Generates images that match input text.
      • StyleGAN: Produces high-quality images of faces and objects.
      • Super Resolution: Enhances image resolution by increasing pixel count.
    • Voice and Music Generation Models
      • Murf: AI voice generation technology that replicates human speech nuances.
      • OpenAI Whisper: Transcription and translation model.
      • Jukedeck and Amper Music: AI-powered music generators for original tracks.
      • AIVA: Generates songs in various styles.
    • Video Generation Models
      • Google’s Imagen Video: Generates high-definition videos.
      • OpenAI Sora: Creates realistic scenes from text instructions.
  • Generative AI Models
    • Generative AI models are AI systems that learn patterns from large datasets to create new content such as text, images, music, or video. Their design varies based on the task and data type.

    • Common types include (How does GenAI develop creativity?):

      • VAEs (Variational Autoencoders): Encode input data into a latent space and decode it to generate new outputs. Used for image generation and anomaly detection (e.g., Fashion MNIST clothing images).
      • GANs (Generative Adversarial Networks): Two networks (generator and discriminator) compete, improving each other. Used for image synthesis and style transfer (e.g., Nvidia’s StyleGAN for realistic faces).
      • Autoregressive Models: Generate data step-by-step, using previous outputs as context. Useful for text and audio (e.g., WaveNet for speech); see the sketch at the end of this list.
      • Transformers: Use encoder-decoder layers for tasks like text generation and translation (e.g., GPT, Gemini).

      Models can be:

      • Unimodal: Input and output are the same type (e.g., GPT-3: text-to-text).
      • Multimodal: Handle multiple data types (e.g., DALL-E: text-to-image; ImageBind: combines text, audio, visuals).
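    • A toy illustration of the autoregressive idea: a character-level bigram model that generates text one step at a time, conditioning each step on the previous output (plain Python; the corpus and sampling choices are made up for illustration):
        import random
        from collections import defaultdict

        corpus = "the theory of the thing then took the third turn"

        # Count how often each character follows each other character
        counts = defaultdict(lambda: defaultdict(int))
        for prev, nxt in zip(corpus, corpus[1:]):
            counts[prev][nxt] += 1

        def generate(start="t", length=30):
            out = start
            for _ in range(length):
                options = counts[out[-1]]      # distribution conditioned on the previous output
                if not options:
                    break
                chars, weights = zip(*options.items())
                out += random.choices(chars, weights=weights)[0]
            return out

        print(generate())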
  • Generative AI vs Traditional AI
    • Traditional AI focuses on analyzing and predicting from existing data, while generative AI creates new data that resembles the training data.
    • Generative AI represents a shift from rule-based systems to creative models that can invent and generate content independently.
  • Generative AI vs Agentic AI
    • Generative AI:
      • Reactive (works on prompt)
      • Prompts -> Generate Content
      • Use case: Content creation, which is then reviewed, refined, and directed by a human
    • Agentic AI:
      • Proactive (can work independently)
      • Prompts -> Actions (Perceive -> Decide -> Execute -> Learn)
      • Use case: Multi-step processes that use the reasoning power of LLMs to break complex tasks into smaller, logical steps, also known as chain-of-thought reasoning (see Prompt Engineering)
  • Text Generation: Large Language Models (LLMs)
    • LLMs interpret context, grammar, and semantics to generate coherent text based on learned patterns.
    • Examples of LLMs include GPT (Generative Pre-trained Transformer) and Google Bard (powered by PaLM, which combines the transformer model with Google’s Pathways AI platform); both have evolved into multimodal models.
    • Capabilities of Text Generation Tools:
      • ChatGPT can handle both text and image inputs, providing context-based responses and assisting in creative tasks like slide creation.
      • Problem solving through basic mathematics and statistics, financial analysis, investment research, code generation, etc.
    • Other text generators:
      • Jasper: Generates high quality marketing content tailored to brand’s voice.
      • Rytr: Content for blogs, email, SEO metadata, and ads on social media.
      • Copy.ai: Content for social media, marketing, and product descriptions.
      • Writesonic: Specific templates for different types of text.
    • Tools for specific use cases:
      • Resoomer: Text Summarization
      • uClassify: Text classification
      • Brand24, Repustate: Sentiment Analysis
      • Weaver: Language Translation
  • Image Generation
    • Generative AI can create new images, modify images based on text prompts.
    • Image Generation Models:
      • DALL-E by OpenAI: Based on GPT, can generate high resolution images in multiple styles, new versions can generate multiple image variations.
      • Stable Diffusion: Open-source model that creates high-resolution images from text prompts; used for image-to-image translation, inpainting, and outpainting.
      • StyleGAN: Enables precise control for manipulating specific features, separates image content from image style, and has evolved to generate higher-resolution images.
    • Image Generation Tools:
      • Freepik, Craiyon, PicsArt, Firefly: Free tools
      • Fotor and Deep Art Effects: Offer various pretrained styles and allow custom styles.
      • DeepArt.io: Turns photos into artwork
      • MidJourney: Hosts image-generation communities and helps artists and designers create images using AI.
  • Audio/Video Generation
    • Speech generation:
    • Music creation:
    • Audio enhancement tools:
  • Code Generation
    • Generates code from natural language input, leveraging deep learning and NLP to understand context (see the sketch at the end of this list)
    • These models can generate new code snippets, complete partial code, optimize existing code, and convert code between programming languages.
    • Tools & Models:
      • GPT: Generates human-like text and code.
      • GitHub Copilot: Powered by OpenAI Codex, trained on natural language text and source code (e.g., from GitHub); integrates with code editors.
      • PolyCoder: Open-source AI code generator based on GPT, trained on GitHub repositories; can create, review, and refine code snippets.
      • IBM Watson, Amazon CodeWhisperer, Tabnine, Replit
    • Benefits:
      • Increase productivity and quality of code.
      • Enable rapid prototyping to iterate on design ideas.
      • Enables cross-platform compatibility and migration.
      • Foster consistent coding standards.
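    • A minimal sketch of asking an LLM to generate code from a natural-language description. It assumes the openai Python SDK and an API key in the environment; the model name and prompt are illustrative, and other providers or local models could be substituted:
        from openai import OpenAI

        client = OpenAI()  # reads OPENAI_API_KEY from the environment

        response = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative model name
            messages=[
                {"role": "system", "content": "You are a helpful coding assistant."},
                {"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."},
            ],
        )
        print(response.choices[0].message.content)  # the generated code snippet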
  • Large Language Models and Transformers
    • LLMs use deep learning techniques on massive datasets containing books, articles, and websites.
    • LLMs excel in natural language processing
    • Popular LLMs:
      • OpenAI’s GPT
      • Google’s LaMDA
      • Meta’s Llama
      • Anthropic’s Claude
    • Capabilities of LLMs:
      • Generating code templates and snippets
      • Assisting with documentation and comments
      • Automated testing and QA
      • Supporting collaborative coding environments
  • Natural Language Processing
    • NLP involves computational techniques to analyze, understand, and generate human language, integrating linguistics, computer science, and artificial intelligence.
    • Key NLP techniques include sentiment analysis, named entity recognition, text classification, and machine translation.
    • Libraries and Tools:
      • Common NLP libraries and tools include IBM Watson, Google Cloud NLP API, NLTK, SpaCy, and Stanford Core NLP.
    • Use cases:
      • Sentiment Analysis: Analyzes text to determine the sentiment expressed. Commonly used in social media monitoring and customer feedback analysis.
      • Named Entity Recognition (NER): Identifies and classifies key entities in text, such as names of people, organizations, and locations. Useful for information extraction and data organization.
      • Text Classification: Automatically categorizes text into predefined labels or categories. Applications include spam detection in emails and topic categorization in news articles.
      • Machine Translation: Translates text from one language to another using NLP techniques. Enhances communication across language barriers, as seen in translation apps and services.
      • Chatbots and Conversational Agents: Simulates human conversation to assist users with queries and tasks. Utilizes NLP for understanding user intent and generating appropriate responses.
      • Information Extraction: Extracts structured information from unstructured text data. Applications include resume parsing and extracting key details from documents.
      • Summarization: Generates concise summaries of lengthy texts, making it easier to grasp essential information quickly.

      • Analyze sentiments in text: Assigning polarity scores to words or phrases (see the sketch at the end of this list).
      • Implement sentiment analysis algorithms: Includes rule-based methods, machine learning algorithms such as naive Bayes and support vector machines, and deep learning models like recurrent neural networks or transformers.
      • Extract named entities from text using NER: Classifying named entities in text using patterns and rules.
      • Build NER models: Using supervised or unsupervised learning approaches.
      • Build chatbots
      • Implement ML translation models
      • Design conversational agents
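    • A small sentiment-analysis sketch using NLTK’s VADER analyzer (NLTK is one of the libraries listed above; the sample sentences are made up, and the lexicon must be downloaded once):
        import nltk
        from nltk.sentiment import SentimentIntensityAnalyzer

        nltk.download("vader_lexicon")  # one-time download of the VADER lexicon
        sia = SentimentIntensityAnalyzer()

        for text in ["I love this product!", "The service was slow and disappointing."]:
            scores = sia.polarity_scores(text)     # polarity scores for the sentence
            print(text, "->", scores["compound"])  # compound score ranges from -1 to 1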

Generative AI in Software Developement

Prompt Engineering

  • Process of designing effective prompts to leverage the full capabilities of generative AI models and produce optimal responses.
  • Importance of Prompt Engineering:
    • Optimizing model efficiency
    • Boosting performance for specific tasks
    • Understanding model constraints
    • Enhancing model security
  • Best Practices for Prompt Creation:
    • Clarity: Use clear and concise language, avoid jargon and complex terms, and provide explicit instructions.
    • Context: Establish the context, provide relevant information.
    • Precision: Be specific and use examples.
    • Role-Play/Persona Pattern: Assume a persona, and provide context for role-play.
  • Prompt Engineering Tools:
    • IBM watsonx.ai Prompt Lab
    • Spellbook: IDE by ScaleAI
    • Dust: Web interface; supports API integration with other models and services
    • PromptPerfect: Optimize prompts for different LLMs for text and image models, supports GPT, Claude, StableLM, Llama, DALL-E, Stable Diffusion
    • prompt-engineering (github)
    • OpenAI Playground
    • LangChain: Python library with functionalities for building and chaining prompts.
    • PromptBase: Prompt’s marketplace
  • Text-to-Text Prompt Techniques:
    • Task Specification
    • Contextual Guidance
    • Domain Expertise
    • Bias Mitigation
    • Framing
    • User Feedback Loop
    • Zero-Shot Techniques: Zero-shot prompting refers to the capability of LLMs to generate meaningful responses to prompts without being given any task-specific examples in the prompt.
    • Few-Shot Techniques: Few-shot prompting used with LLMs relies on in-context learning. In this technique, demonstrations are provided in the prompt to steer the model toward better performance.
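        A hypothetical few-shot prompt, where the labeled demonstrations steer the model toward the desired output format (the reviews are made up):

        Classify the sentiment of each review as Positive or Negative.
        Review: "Battery life is amazing." -> Positive
        Review: "The screen cracked within a week." -> Negative
        Review: "Setup took five minutes and everything just worked." ->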
  • Common Useful Patterns:
    • Persona Pattern: A technique where you instruct the AI to assume a specific role, expertise, or personality to provide more targeted and contextual responses. This helps the model adopt the knowledge, tone, and perspective of the specified persona, leading to more relevant and specialized outputs.
        You are an expert software architect with 15 years of experience in designing scalable systems. Provide recommendations for...
      
    • Interview Pattern: A method where you set up the AI as an interviewer who asks clarifying questions to gather necessary information before providing a comprehensive response. This ensures all relevant details are collected for optimal output generation.
        You will act as a SEO and content marketing expert. You will interview me, asking me (one at the time) all the relevant questions necessary for you to generate the best possible answer to my queries.
      
  • Chain-of-Thought approach: Approach to solve complex problems by breaking them down into smaller, easier steps.
    • Few-Shot chain-of-thought: Provides examples to break down problems.
    • Zero-Shot chain-of-thought: Encourages model to think on its own.
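    • A hypothetical zero-shot chain-of-thought prompt; the closing line nudges the model to reason step by step (the arithmetic scenario is made up):
      A shop sells pencils in packs of 12. A teacher needs 150 pencils.
      How many packs must she buy, and how many pencils will be left over?
      Let's think step by step.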
  • Tree-of-Thought approach: A prompting strategy designed to enhance the reasoning abilities of large language models by structuring thinking as a tree-like process. Instead of generating a single linear response, the model is encouraged to explore multiple reasoning branches, evaluate them, and select the most promising one, allowing more deliberate and interpretable decision-making in complex or open-ended tasks.
      You are planning a school fundraising event.
      Follow this structure:
    
      List three different types of events (label them A, B, C).
    
      For each event, list:
          a. Key benefits
          b. Likely challenges
          c. What resources would be needed
    
      Compare the three events and choose the most feasible one. Explain why it is better than the others.
    
  • The Playoff Method: A strategy for comparing responses generated from multiple prompts. Several different prompts are generated for a given task, and the responses from each prompt are compared pairwise to determine which prompt produces the best output.
      //Step1
      Can you provide me with 5 different text prompts, asking to generate a social media post celebrating NeoTech Solutions’ 100th enterprise software deployment? Also generate their possible responses. Keep it professional and concise (under 100 words). 
    
      Include the following elements: 
      1. Thanking the team 
      2. Highlighting innovation
    
      //Step2
      For each set of responses, perform a detailed pairwise comparison between every possible pair based on the following criteria: clarity of expression, coverage of the requirements, and engagement potential. Clearly state which response is stronger in each comparison, and then determine the overall strongest response by aggregating these pairwise results.
    
      Example:
      1. Compare Response 1 vs Response 2, noting which is clearer, more comprehensive, and more engaging.
      2. Repeat for Response 1 vs Response 3, Response 1 vs Response 4, and so on for all pairs.
      3. Summarize which responses perform best overall based on these pairwise assessments.
    
  • Text-to-Image Prompt Techniques:
    • Style Modifiers: Keywords that define artistic style, medium, or visual aesthetic (e.g., “oil painting,” “photorealistic,” “anime style,” “vintage,” “cyberpunk”).
    • Quality Boosters: Terms that enhance image quality and detail (e.g., “high resolution,” “4K,” “detailed,” “sharp focus,” “professional photography”).
    • Repetition: Repeating key words or phrases to emphasize specific elements and increase their prominence in the generated image.
    • Negative Prompts: Specifying what NOT to include using terms like “no,” “avoid,” “exclude” to prevent unwanted elements (e.g., “no text,” “no blur,” “no distortion”).
    • Fix Deformed Generations: Techniques to correct common AI generation issues like extra limbs, distorted faces, or unrealistic proportions using specific corrective terms.
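    • A hypothetical text-to-image prompt combining style modifiers, quality boosters, and a negative prompt (the scene is made up; exact negative-prompt syntax varies by tool):
      Prompt: A cozy mountain cabin at sunset, oil painting, warm lighting, highly detailed, sharp focus, 4K
      Negative prompt: text, blur, distortion, extra windows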
  • Prompt Hacking:
    • Definition: Techniques used to manipulate AI models to bypass safety measures, access restricted information, or generate content that violates intended use policies.
    • Key Differences from Prompt Engineering:
      • Prompt Engineering: Focuses on optimizing legitimate use cases and improving model performance for intended tasks
      • Prompt Hacking: Attempts to exploit model vulnerabilities and bypass safety constraints
    • Common Techniques:
      • Jailbreaking: Using creative prompts to make the model ignore safety instructions
      • Role Playing: Making the model assume a character that can perform restricted actions
      • Indirect Prompting: Asking for information in roundabout ways to avoid direct restrictions
      • Context Manipulation: Providing misleading context to trick the model
    • Security Implications: Understanding prompt hacking helps in developing better safety measures and robust AI systems

Gen AI Tools

  • Refact.ai: Streamline code organization, refactor repeated code blocks.
  • OpenAI Codex: Generates sample code for various architectures, ensures structure and consistency.
  • DiagramGPT: A tool for converting natural language descriptions into architectural diagrams.
  • Leonardo.ai: An AI image generation platform that can produce visual assets and concept images from text descriptions.
  • Lucidchart: A diagramming application that integrates AI features for auto-generating system diagrams.
  • Microsoft's Visio: A diagramming tool that can incorporate AI features for creating visual representations of data.
  • PlantUML: A tool that allows developers to create UML diagrams from plain text descriptions.
  • ``:
  • ``:
  • ``:

Key Concepts

🔝


Tools and Libraries

🔝

References

🔝