I took a look at this for HuggingFace models.
I can filter the list from 1,211,176 down to 3,194 records with query parameters, but there is no way to filter by “Serverless” endpoints directly. So, some logic is needed to check each record and build a cache.
Pre-filter Criteria:
- pipeline_tag=text-generation
- library=transformers
- other=endpoints_compatible
- language=en
- search=instruct
- inference_api=true
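For reference, a minimal sketch of how these criteria map onto the Hub’s /api/models query string (the same parameters the script below hardcodes into API_URL):
Code:
// Build the models-list URL from the pre-filter criteria above
const params = new URLSearchParams({
  pipeline_tag: "text-generation",
  library: "transformers",
  other: "endpoints_compatible",
  language: "en",
  search: "instruct",
  inference_api: "true",
});
const url = `https://huggingface.co/api/models?${params}`;
// e.g. fetch(url).then(r => r.json()).then(list => console.log(list.length)); // ~3194 records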
Additional Filters:
- Inference API (serverless): requires checking the response
- Excludes GGUF, GPTQ, AWQ, 4-bit, and 4/8-int (quantized) models, as well as anything under 7B parameters (a consolidated regex version of these checks is sketched below).
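As an aside, the chain of substring checks in the script below could be collapsed into a single regex covering the same markers; a sketch:
Code:
// Same quantization/variant markers the script below checks one by one
const EXCLUDE_RE = /-(awq|4bit|bf16|fp8|dpo|gptq|int4|int8|quantized)|gguf|\d+\.\d+bits?/i;
const isExcluded = EXCLUDE_RE.test(modelId);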
hf_prefixes.json
Code:
"provider_prefixes": [
"codellama",
"HuggingFaceTB",
"deepseek-ai",
"dnotitia",
"meta-llama",
"microsoft",
"mistralai",
"Qwen",
"tiiuae",
"unsloth"
]
script.js
Code:
import fetch from "node-fetch"; // Node 18+ has a global fetch, so this import is optional there
import fs from "fs";

// Define paths for the cache, stats, and provider-prefix files
const CACHE_FILE = "./hf_cache.json";
const STATS_FILE = "./hf_stats.json";
const PROVIDER_PREFIXES_FILE = "./hf_prefixes.json";
// Query string mirrors the pre-filter criteria listed above
const API_URL = "https://huggingface.co/api/models?pipeline_tag=text-generation&library=transformers&other=endpoints_compatible&language=en&search=instruct&inference_api=true";
const useProviderPrefixes = true;

// Fetch models from the API
async function fetchModels() {
  try {
    const response = await fetch(API_URL);
    if (!response.ok) {
      console.error("API request failed with status:", response.status);
      return [];
    }
    const models = await response.json();
    if (!models || models.length === 0) {
      console.log("No models found in the API response.");
      return [];
    }
    console.log("Found models:", models.length);
    return models;
  } catch (error) {
    console.error("Error fetching models:", error);
    return [];
  }
}

// Check if the model is serverless (kept async so a per-model API check can be slotted in later)
async function isServerlessModel(model) {
  return Boolean(model && model.tags && model.tags.includes("endpoints_compatible"));
}

// Simplify model data to only the required fields for the cache
function simplifyModelData(model) {
  return {
    modelId: model.modelId,
  };
}

// Generate stats and write to hf_stats.json
// Expects the full model objects (with likes/downloads), not the simplified cache entries
function generateStats(serverlessModels) {
  const stats = {
    totalModels: serverlessModels.length,
    uniqueIds: new Set(serverlessModels.map(model => model.modelId)).size,
    maxLikes: Math.max(...serverlessModels.map(model => model.likes ?? 0), 0),
    maxDownloads: Math.max(...serverlessModels.map(model => model.downloads ?? 0), 0),
    averageLikes: serverlessModels.reduce((sum, model) => sum + (model.likes ?? 0), 0) / serverlessModels.length || 0,
    averageDownloads: serverlessModels.reduce((sum, model) => sum + (model.downloads ?? 0), 0) / serverlessModels.length || 0,
  };
  fs.writeFileSync(STATS_FILE, JSON.stringify(stats, null, 2));
  console.log("Stats generated:", stats);
}

// Read provider prefixes from the JSON file
function getProviderPrefixes() {
  try {
    const data = fs.readFileSync(PROVIDER_PREFIXES_FILE, "utf8");
    const json = JSON.parse(data);
    return json.provider_prefixes || [];
  } catch (error) {
    console.error("Error reading provider prefixes:", error);
    return [];
  }
}

// Extract and validate the parameter size from the model id
function isValidSize(modelId) {
  const sizeMatch = modelId.match(/(\d+(\.\d+)?)b/i); // Matches parameter sizes like 7B, 10B, 1.5B
  if (sizeMatch) {
    const size = parseFloat(sizeMatch[1]);
    return size >= 7; // Only accept models >= 7B
  }
  return false;
}

// Fetch and cache serverless models
async function fetchServerlessModels(refreshCache = false) {
  if (!refreshCache && fs.existsSync(CACHE_FILE)) {
    console.log("Cache exists, loading models from cache...");
    // Stats were written when the cache was built; the cache itself only keeps modelId
    return JSON.parse(fs.readFileSync(CACHE_FILE, "utf8"));
  }
  console.log("Fetching models from Hugging Face API...");
  const models = await fetchModels();
  if (models.length === 0) return [];
  const serverlessModels = [];
  const providerPrefixes = useProviderPrefixes ? getProviderPrefixes() : [];
  for (const model of models) {
    const modelId = model.modelId.toLowerCase();
    // Exclude quantized/variant tags in the model id
    const isExcluded =
      modelId.includes("-awq") ||
      modelId.includes("-4bit") ||
      modelId.includes("-bf16") ||
      modelId.includes("-fp8") ||
      modelId.includes("-dpo") ||
      modelId.includes("-gptq") ||
      modelId.includes("-int4") ||
      modelId.includes("-int8") ||
      modelId.includes("-quantized") ||
      modelId.includes("gguf") ||
      /(\d+\.\d+bit)s?/.test(modelId); // Excludes XX.XXbit(s) in the model id
    if (isExcluded || !isValidSize(modelId)) {
      continue; // Skip quantized models and those under 7B parameters
    }
    const isProviderValid = useProviderPrefixes
      ? providerPrefixes.some(prefix => modelId.startsWith(prefix.toLowerCase()))
      : true;
    if (isProviderValid && (await isServerlessModel(model))) {
      serverlessModels.push(model); // Keep the full model for stats; simplify before caching
    }
  }
  if (serverlessModels.length === 0) return [];
  // Generate stats while likes/downloads are still on the objects, then simplify and sort
  generateStats(serverlessModels);
  const simplified = serverlessModels.map(simplifyModelData);
  simplified.sort((a, b) => a.modelId.localeCompare(b.modelId));
  try {
    fs.writeFileSync(CACHE_FILE, JSON.stringify(simplified, null, 2));
    console.log("Serverless models cached successfully.");
  } catch (error) {
    console.error("Error writing cache file:", error);
  }
  return simplified;
}

// Trigger the fetch; uses the cache unless it is missing or a refresh is requested
fetchServerlessModels().then((serverlessModels) => {
  if (serverlessModels.length === 0) {
    console.log("No serverless models found.");
  } else {
    console.log("Serverless Models:", serverlessModels);
  }
});
Returns 38 results (318 without the provider-prefix filter), sorted:
Serverless Models: [
{ modelId: 'deepseek-ai/deepseek-coder-33b-instruct' },
{ modelId: 'deepseek-ai/deepseek-coder-7b-instruct-v1.5' },
{ modelId: 'deepseek-ai/deepseek-math-7b-instruct' },
{ modelId: 'dnotitia/Llama-DNA-1.0-8B-Instruct' },
{ modelId: 'meta-llama/CodeLlama-13b-Instruct-hf' },
{ modelId: 'meta-llama/CodeLlama-34b-Instruct-hf' },
{ modelId: 'meta-llama/CodeLlama-70b-Instruct-hf' },
{ modelId: 'meta-llama/CodeLlama-7b-Instruct-hf' },
{ modelId: 'meta-llama/Llama-3.1-405B-Instruct' },
{ modelId: 'meta-llama/Llama-3.1-70B-Instruct' },
{ modelId: 'meta-llama/Llama-3.1-8B-Instruct' },
{ modelId: 'meta-llama/Llama-3.3-70B-Instruct' },
{ modelId: 'meta-llama/Meta-Llama-3-70B-Instruct' },
{ modelId: 'meta-llama/Meta-Llama-3-8B-Instruct' },
{ modelId: 'mistralai/Mistral-7B-Instruct-v0.1' },
{ modelId: 'mistralai/Mistral-7B-Instruct-v0.2' },
{ modelId: 'mistralai/Mistral-7B-Instruct-v0.3' },
{ modelId: 'mistralai/Mixtral-8x22B-Instruct-v0.1' },
{ modelId: 'mistralai/Mixtral-8x7B-Instruct-v0.1' },
{ modelId: 'Qwen/Qwen2-57B-A14B-Instruct' },
{ modelId: 'Qwen/Qwen2-72B-Instruct' },
{ modelId: 'Qwen/Qwen2-7B-Instruct' },
{ modelId: 'Qwen/Qwen2-Math-72B-Instruct' },
{ modelId: 'Qwen/Qwen2.5-14B-Instruct' },
{ modelId: 'Qwen/Qwen2.5-32B-Instruct' },
{ modelId: 'Qwen/Qwen2.5-72B-Instruct' },
{ modelId: 'Qwen/Qwen2.5-7B-Instruct' },
{ modelId: 'Qwen/Qwen2.5-Coder-14B-Instruct' },
{ modelId: 'Qwen/Qwen2.5-Coder-32B-Instruct' },
{ modelId: 'Qwen/Qwen2.5-Coder-7B-Instruct' },
{ modelId: 'Qwen/Qwen2.5-Math-72B-Instruct' },
{ modelId: 'Qwen/Qwen2.5-Math-7B-Instruct' },
{ modelId: 'tiiuae/falcon-40b-instruct' },
{ modelId: 'tiiuae/falcon-7b-instruct' },
{ modelId: 'tiiuae/falcon-mamba-7b-instruct' },
{ modelId: 'tiiuae/Falcon3-10B-Instruct' },
{ modelId: 'tiiuae/Falcon3-7B-Instruct' },
{ modelId: 'tiiuae/Falcon3-Mamba-7B-Instruct' }
]
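One tweak worth making: as written, the refreshCache flag is never set from the command line, so the only way to rebuild the cache is to delete hf_cache.json. A small sketch of wiring it to a --refresh argument (the flag name is my own choice):
Code:
// `node script.js --refresh` forces a cache rebuild; plain `node script.js` uses the cache
const refresh = process.argv.includes("--refresh");
fetchServerlessModels(refresh).then((serverlessModels) => {
  if (serverlessModels.length === 0) {
    console.log("No serverless models found.");
  } else {
    console.log("Serverless Models:", serverlessModels);
  }
});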
P.S. I did try parsing each model page for the “Serverless” dropdown button, but it proved to be much slower. The method here takes <1 second and, after the initial cache, doesn’t need to run again unless triggered. So, after thinking about it, this could be executed in the build step and just cached, for no runtime performance penalty.
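If a stronger per-model check ever becomes necessary, isServerlessModel() could query the per-model endpoint instead of scraping the page. I believe the Hub API can expand an inference status field (values like "warm"/"cold"), but I haven’t verified the exact parameter and field names, so treat this sketch as an assumption to check against the current API docs:
Code:
// ASSUMPTION: GET /api/models/{id}?expand[]=inference is believed to return an
// `inference` status such as "warm" or "cold"; verify before relying on it
async function isServerlessModel(model) {
  const res = await fetch(`https://huggingface.co/api/models/${model.modelId}?expand[]=inference`);
  if (!res.ok) return false;
  const info = await res.json();
  return info.inference === "warm"; // "warm" = loaded and ready; adjust as needed
}
This costs one request per model, though, which is the same slowness problem as page parsing, so the tag check plus cache still wins here.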
Maybe also filter by date, keeping only models created after a certain cutoff or, say, updated within the last year.
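A rough sketch of that, assuming each listed model carries a createdAt ISO timestamp (the list endpoint appears to return it by default; lastModified may need full=true, worth verifying):
Code:
// Keep only models created within the last year (cutoff is arbitrary);
// `createdAt` is assumed to be an ISO date string on each listed model
const ONE_YEAR_MS = 365 * 24 * 60 * 60 * 1000;
function isRecent(model) {
  if (!model.createdAt) return false; // no timestamp: treat as too old
  return Date.now() - new Date(model.createdAt).getTime() < ONE_YEAR_MS;
}
// In the filter loop: if (isExcluded || !isValidSize(modelId) || !isRecent(model)) continue;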