Scraper Spider
2026-03-09 02:47
qwen stories from the last 14 days
188.  HN Ask HN: How to serve inference as we do with containers, with cached tokens
The user from a private education group is investigating efficient methods for serving model inference using containers that cache tokens, leveraging the vLLM framework. They have access to multiple GPUs but prefer not to allocate individual GPUs per user or engage in training models. Their existing setup successfully runs a local Qwen model on a single server; however, they aim to enhance this by implementing key-value (KV) caches within vLLM. The primary goal is to achieve a solution that is both simple and secure, ensuring there is no data leakage between different user sessions. This pursuit involves maintaining the efficiency of inference processes while safeguarding user data integrity across concurrent interactions with the model. Keywords: #phi4, Ask HN, GPUs, KV caches, Qwen, cached tokens, containers, data leakage, inference, models, private education group, research team, server, session security, vLLM
    news.ycombinator.com   a day ago
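The KV-cache reuse the poster is after is usually called prefix caching: requests that share a common prompt prefix reuse the cached computation for that prefix, and isolation comes from scoping cache entries so one user's state is never served to another. A minimal stdlib sketch of that keying scheme (the cache structure and session scoping here are illustrative assumptions, not vLLM's actual implementation):

```python
import hashlib

class PrefixCache:
    """Toy prefix cache: entries are keyed by (session, prefix hash),
    so cached state is never shared across user sessions."""

    def __init__(self):
        self._store = {}

    def _key(self, session_id, tokens):
        # Hash the token prefix; include the session id in the key
        # so identical prefixes from different users stay separate.
        digest = hashlib.sha256(repr(tokens).encode()).hexdigest()
        return (session_id, digest)

    def get(self, session_id, tokens):
        return self._store.get(self._key(session_id, tokens))

    def put(self, session_id, tokens, state):
        self._store[self._key(session_id, tokens)] = state

cache = PrefixCache()
cache.put("alice", [1, 2, 3], "kv-state-for-alice")
hit = cache.get("alice", [1, 2, 3])   # reuse within the same session
miss = cache.get("bob", [1, 2, 3])    # same prefix, other session: no hit
```

In vLLM itself, automatic prefix caching is a server-side option (the `enable_prefix_caching` engine argument); per-user isolation would still be handled at the application layer, roughly as the session key above sketches.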
286.  HN Stop Making Models Smarter
The author discusses a preference for "dumber" AI models, such as Composer 1.5, despite their need for detailed guidance and reliance on web searches due to limited knowledge. These simpler models are perceived to have fewer biases compared to advanced ones like Claude Opus 4.6, which excels at processing complex requests with minimal input through a method known as "one-shotting." While the author appreciates that dumber models require less caution in use because of their straightforwardness, they acknowledge that smarter models may need additional guardrails to prevent overconfidence and hasty conclusions. The author closes by inviting readers to share their own experiences, weighing the advantages and limitations of each kind of model. Keywords: #phi4, Claude Opus, Composer, Dadaist frogs, Qwen, betting mechanic, conclusions, dumber models, game design, guardrails, guidance, knowledge gap, one-shotting, opinions, overconfident, real work, smartest model, system prompts, tool use, web search
    news.ycombinator.com   a day ago
734.  HN Show HN: I fine-tuned Qwen 3.5 (0.8B–4B) on a Mac for text-to-SQL – 2B beats 12B
The project showcases how fine-tuning Qwen 3.5 language models (ranging from 0.8B to 4B parameters) for text-to-SQL tasks can be efficiently accomplished using LoRA (Low-Rank Adaptation) on an Apple Silicon Mac, leveraging its unified memory architecture within approximately 15 minutes. Key insights reveal that a medium-sized model with 2 billion parameters outperformed both larger and smaller counterparts in SQL query generation from natural language inputs. The study highlights the superiority of LoRA fine-tuning over simple prompt engineering, significantly boosting the validity of generated SQL queries to 86.5% compared to just 1.5% through prompts alone. This approach underscores resource efficiency by utilizing Apple Silicon’s capabilities without requiring external GPUs, making it feasible on standard Macs. The experimentation was conducted with a synthetic text-to-SQL dataset comprising 5,000 examples and utilized specific hyperparameters for quick iteration, such as learning rate adjustments and iteration counts. The project structure is comprehensive, featuring scripts for data preparation, training, evaluation, and model fusion, along with organized directories for datasets and results. Despite its exploratory nature and limitations—such as reliance on a single dataset, fixed hyperparameters, and restricted testing scenarios—the demonstration achieved competitive semantic accuracy when compared to more resource-intensive models or those using full fine-tuning techniques. This work illustrates the potential of localized, minimal-resource model adaptation for specialized tasks like text-to-SQL, demonstrating that LoRA can be effectively applied in consumer-grade hardware environments. 
Keywords: #phi4, Adapter Weights, Apple Silicon, Dataset, Evaluation Metrics, Execution Accuracy, Fine-tuning, HuggingFace, Hyperparameters, Learning Project, LoRA, Loss Monitoring, MLX, Mac, Model Size, Natural Language, Prompt Engineering, Python, Qwen3.5, SQL Queries, Semantic Accuracy, Synthetic Data, Text Completion, Text-to-SQL, Training Iterations, Unified Memory, uv sync
    github.com   3 days ago
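The reason LoRA fits on a laptop is the shape of the update: the base weight matrix W stays frozen, and training learns only a low-rank correction ΔW = (α/r)·B·A, with B of shape d×r and A of shape r×k for a small rank r. A stdlib sketch of applying that update (the dimensions and scaling follow the standard LoRA formulation, not this particular project's code):

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_update(W, A, B, alpha, r):
    """Return W + (alpha / r) * B @ A, the LoRA-adapted weight."""
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# Rank-1 example: a 2x2 base weight plus a tiny learned update.
W = [[1.0, 0.0],
     [0.0, 1.0]]
B = [[1.0], [2.0]]   # d x r  (2 x 1)
A = [[0.5, 0.5]]     # r x k  (1 x 2)
W_adapted = lora_update(W, A, B, alpha=1.0, r=1)
```

For a d×k layer, the adapter stores only d·r + r·k numbers instead of d·k, which is why the adapter weights from a run like this are small enough to train and ship separately from the base model.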
937.  HN OpenCode Benchmark Dashboard – compare different LLM providers / quants / models
The OpenCode Benchmark Dashboard is a sophisticated tool crafted to aid developers in evaluating and comparing the performance of large language models (LLMs) on their hardware. Its primary function is to facilitate testing between local and remote LLMs, emphasizing both accuracy and speed through dynamic visual representations that extend beyond conventional metrics such as tokens per second. The dashboard introduces significant metrics like "useful tokens" to provide a more precise measure of performance in practical scenarios. Key features of the OpenCode Benchmark Dashboard include extensive testing capabilities, an intuitive user interface, and the flexibility to assess models based on specific applications, including coding or data extraction tasks. Notably, the tool reveals that smaller quantized models, such as Qwen 3.5 with 35 billion parameters, can surpass larger models in terms of accuracy. Additionally, it is observed that remote models frequently outperform their local counterparts. This tool proves invaluable for optimizing LLM performance across diverse hardware configurations and aids developers in selecting the most suitable model by conducting tests and reviewing outcomes via an interactive dashboard interface. The installation process requires setting up necessary dependencies like the Bun runtime environment and configuring models on a local basis. Keywords: #phi4, Benchmark Dashboard, Bun runtime, CPU-only systems, GPT OSS, LLMs, Nemotron Nano, OpenCode, Qwen, accuracy, data extraction, hardware setup, interactive dashboard, local models, model comparison, performance metrics, problem-solving capability, quantized models, remote models, speed, tokens per second, useful tokens
    grigio.org   4 days ago
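The "useful tokens" idea can be made concrete: raw tokens per second counts everything a model emits, while useful throughput discounts tokens spent on reasoning scratchpads, retries, or other output that never reaches the final answer. The dashboard does not publish its exact formula, so the split below is an assumption for illustration:

```python
def throughput(total_tokens, seconds):
    """Raw decode speed: every emitted token counts."""
    return total_tokens / seconds

def useful_throughput(answer_tokens, seconds):
    """Only tokens that end up in the final answer count."""
    return answer_tokens / seconds

# A run that emitted 900 tokens in 30 s, of which 300 formed the answer:
raw = throughput(900, 30)             # 30.0 tok/s
useful = useful_throughput(300, 30)   # 10.0 tok/s
```

Under a metric like this, a small quantized model that answers tersely can beat a larger model that "thinks" at length, which matches the dashboard's observation about smaller models surpassing larger ones.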
952.  HN Something is afoot in the land of Qwen
The resignation of Junyang Lin and several key researchers from Alibaba's Qwen team has sparked concerns regarding the future of their open-weight models following an internal reorganization at Alibaba. This restructuring led to the appointment of a new leader from Google's Gemini team, prompting an emergency meeting presided over by CEO Wu Yongming due to its perceived importance. The recently released Qwen 3.5 has garnered acclaim for its exceptional performance and scalability across various model sizes, highlighting its prominence in the AI sector. The departures pose a risk to future developments unless Alibaba can effectively retain or replace this talent. Industry observers are optimistic that these core team members will either establish a new enterprise or join other research labs, continuing their innovative contributions to the field of artificial intelligence. Keywords: #phi4, AI models, Alibaba, Binyuan Hui, Bowen Yu, CEO Wu Yongming, Junyang Lin, Kaixin Li, Qwen, Qwen 3.5, Tongyi Lab, coding tasks, departure, emergency meeting, multi-modal model, open weight models, re-org, research team, researchers, resignation, technology industry
    simonwillison.net   4 days ago
   https://news.ycombinator.com/item?id=47246746   4 days ago
   https://news.ycombinator.com/item?id=47249343#47249782   4 days ago
   https://openrouter.ai/qwen/qwen3.5-27b   4 days ago
   https://pi.dev   4 days ago
   https://huggingface.co/Qwen/Qwen3.5-35B-A3B/discus   4 days ago
   https://www.reddit.com/r/LocalLLaMA/comments/   4 days ago
   https://insights.som.yale.edu/insights/yale-study-finds   4 days ago
   https://huggingface.co/models?other=qwen3_5&sort=least_p   4 days ago
   https://zed.dev/agentic   4 days ago
   https://apnews.com/article/immigration-raid-hyundai-kor   4 days ago
   https://www.koreatimes.co.kr/foreignaffairs/20251112&#x   4 days ago
   https://www.pbs.org/newshour/nation/attorney-says-   4 days ago
   https://www.brookings.edu/articles/macroeconomic-implic   4 days ago
   https://reclaimthenet.org/china-man-chair-interrogation-soci   4 days ago
   https://news.ycombinator.com/item?id=47252833   4 days ago
   https://status.claude.com/   4 days ago
   https://huggingface.co/Qwen/Qwen3.5-27B   4 days ago
   https://www.migrationpolicy.org/article/biden-deportati   4 days ago
   https://www.theguardian.com/us-news/2025/dec/   4 days ago
   https://www.theguardian.com/us-news/2026/jan/   4 days ago
   https://www.pbs.org/newshour/nation/a-u-s-citizen-   4 days ago
   https://www.propublica.org/article/immigration-dhs-amer   4 days ago
   https://en.wikipedia.org/wiki/Windrush_scandal   4 days ago
   https://imar.ro/~mbuliga/ai-talks.html   4 days ago
   https://github.com/anthropics/claude-code/releases   2 days ago
   https://xkcd.com/1172   2 days ago
   https://www.cato.org/blog/5-ice-detainees-have-violent-   2 days ago
   https://www.nbcnews.com/data-graphics/us-immigration-tr   2 days ago
   https://humanrightsfirst.org/yunseo-chung-v-trump-administra   2 days ago
   https://status.claude.com/incidents/kyj825w6vxr8   2 days ago
1005.  HN Did Alibaba just kneecap its powerful Qwen AI team?
Alibaba's AI research team has faced significant challenges due to the departure of key leaders like technical architect Junyang "Justin" Lin following the release of its acclaimed open-source generative model, Qwen3.5. This model was notably praised by figures such as Elon Musk for its efficiency and intelligence density. The exits coincide with a strategic pivot within Alibaba towards monetization under new leadership, potentially compromising its commitment to open-source projects that have previously drawn interest from enterprise users and developers. A reorganization has placed AI initiatives under the "Qwen C-end Business Group," indicating a shift from research-driven goals to commercially-oriented objectives, mirroring trends observed in other tech companies like Meta. Industry experts express concern over future versions of Qwen possibly being restricted behind paid APIs as Alibaba seeks to enhance its cloud service metrics. This potential change urges enterprises reliant on current open-source resources to secure them promptly. The loss of Lin is particularly felt within the community, as he played a crucial role in integrating Eastern engineering expertise with Western open-source practices. As Alibaba approaches its fiscal earnings report, uncertainty looms about whether Qwen will maintain its position as a global AI leader or be absorbed into broader corporate financial strategies. Keywords: #phi4, Alibaba, Alibaba Cloud, Apache 2.0, DingTalk, Gated DeltaNet, Gemini-fication, Hao Zhou, Junyang Lin, Qwen AI, commercial scale, generative models, intelligence density, open source
    venturebeat.com   4 days ago
   https://news.ycombinator.com/item?id=47236390   4 days ago
   https://tongyi.aliyun.com/   4 days ago
1105.  HN Qwen 3.5: best open-weight vision models, now on live video at 200ms
Qwen 3.5, covered by The Overshoot Blog, stands out among open-weight vision models for its ability to process live video at roughly 200 milliseconds of latency. That figure puts it among the fastest open models for real-time video, making it a candidate for applications that demand immediate visual analysis and response. Keywords: #phi4, 200ms, Overshoot Blog, Qwen, live video, open-weight, vision models
    blog.overshoot.ai   5 days ago
1120.  HN Show HN: Qwen 3.5 running on a $300 Android phone – on-device, open source
Off Grid is an innovative open-source AI suite for Android and iOS devices that offers extensive offline capabilities without the need for internet connectivity or data uploads. It runs the newly released "Qwen 3.5 Small" models and is designed to work efficiently on mid-range devices priced between $200-300, although performance varies with device hardware, particularly optimized for flagship models. The suite includes a variety of AI functionalities: text generation using models like Qwen 3 and Llama 3.2; image generation featuring real-time preview through Stable Diffusion; vision AI to analyze scenes or documents via the camera; built-in tools such as web search and calculator accessible through function calling; voice input with on-device transcription powered by Whisper; and document analysis for various file types including PDFs, code files, and CSVs. Installation of Off Grid can be accomplished via app stores or by building from source, which requires specific development tools like Node.js and Xcode. The application is rigorously tested across platforms to ensure reliable functionality. It garners significant community engagement on Slack and invites contributions to the project. The positive reception is evident in its popularity, with over 780 GitHub stars and approximately 2,000 downloads. Off Grid leverages established open-source projects such as llama.cpp and whisper.cpp, enhancing its feature set while prioritizing user privacy through offline processing. Keywords: #phi4, AI, Android, App Store, Core ML, Document Analysis, GitHub, Image Generation, Jest, Local LLM, Maestro, PDF Extraction, Play Store, Qwen, React Native, Snapdragon, Stable Diffusion, Text Generation, Vision AI, Voice Transcription, Whisper, XCTest, llamacpp, whispercpp
    github.com   5 days ago
   https://github.com/alichherawalla/off-grid-mobile-ai&#x   5 days ago
1142.  HN Qwen Tech Lead Steps Down
Qwen has announced the resignation of its technology lead, marking a significant change in the team's leadership. The linked page on x.com could not be rendered by the scraper, however: it returned only X's notice that JavaScript is disabled, advising users to enable JavaScript or switch to a supported browser and directing them to the Help Center for details. Keywords: #phi4, Browser, Help Center, JavaScript, Qwen, Qwen Tech Lead, Steps Down, Supported Browsers, Tech Lead, x.com
    twitter.com   5 days ago
1152.  HN Qwen Lead "Forced Out"
The snippet from Reddit carries the headline "Qwen Lead 'Forced Out,'" suggesting that the lead of the Qwen team was ousted from his role. The snippet itself offers no further context: it gives no reason for the departure and no details of the circumstances, leaving readers with an incomplete picture of the event and its implications. Keywords: #phi4, Forced Out, Qwen, Qwen Lead, Reddit, front page, internet
    old.reddit.com   5 days ago
   https://xcancel.com/kxli_2000/status/2028880971945   5 days ago
1203.  HN Migrating Elderly Care AI from Qwen 3 to 3.5 on Apple Silicon – 14x Latency Fix
The migration of Elderly Care AI systems from Qwen 3 to the more advanced Qwen 3.5 on Apple Silicon involved transitioning from using the llama.cpp inference framework to leveraging Apple's MLX, which is optimized through Metal-native technology for improved throughput. A significant insight during this process was that Qwen 3.5 functions as a vision-language model requiring specialized handling via the `mlx-vlm` library due to its unique architecture comprising a vision tower. An optimization enhancement was achieved by modifying the default thinking mode in the chat template, which effectively reduced latency for text-only interactions prevalent in therapeutic dialogues. Benchmarking tests demonstrated that Qwen 3.5 powered by MLX on port 8018 significantly outperformed llama.cpp on port 8017, showcasing a threefold improvement in mean latency and a 3.6 times enhancement in p95 latency. This performance boost was accompanied by a slight elevation in quality scores due to differences in Metal implementation. While these advancements were promising for non-crisis interactions, with response times comfortably within target limits of 7–10 seconds, the concurrency model posed challenges. Unlike the parallel processing capabilities of llama-server, `mlx-vlm` processes requests sequentially on a single thread, raising concerns about potential bottlenecks when managing multiple residents from one device. This highlighted the need for further research into effectively handling high concurrency to maintain optimal performance without degradation, even with up to 250 residents being served concurrently. Keywords: #phi4, Apple Silicon, Benchmark, Concurrency Model, DeltaNet Architecture, Elderly Care AI, Generation Thread, Holistic Quality, LLM Generation, Latency Fix, MLX Framework, Mean Latency, Metal-native, Qwen 3.5, Safety Paths, Serial Processing, Therapeutic Intent, Thinking Mode Patch, Unified Memory Architecture, Vision-Language Model, llamacpp, mlx-vlm
    medium.com   5 days ago
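The mean and p95 figures in this benchmark are standard summaries over per-request latencies: the mean is the average, and p95 is the value below which 95% of requests fall, so it captures tail behavior that a mean hides. A stdlib sketch of computing both (the sample latencies are made up for illustration):

```python
import math
import statistics

def p95(samples):
    """95th percentile via the nearest-rank method."""
    ordered = sorted(samples)
    rank = math.ceil(0.95 * len(ordered))  # 1-based rank
    return ordered[rank - 1]

# Ten hypothetical per-request latencies in seconds; one slow outlier.
latencies = [1.2, 1.3, 1.1, 1.4, 5.0, 1.2, 1.3, 1.1, 1.2, 1.5]
mean = statistics.mean(latencies)   # 1.63 s: the outlier barely registers
tail = p95(latencies)               # 5.0 s: the outlier dominates the tail
```

This is why the article reports both numbers: the 3x mean improvement and the 3.6x p95 improvement measure different things, and for a safety-sensitive deployment the tail is usually the binding constraint. Python's `statistics.quantiles(latencies, n=20)[18]` gives an interpolated alternative to the nearest-rank method above.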
1259.  HN Qwen 3.5: small models with impressive performance
The tweet behind this headline concerns "Qwen 3.5," small models recognized for notable performance, but the page at x.com returned only X's notice that JavaScript is disabled. To view it, users must enable JavaScript or switch to a browser that supports it; the Help Center lists compatible browsers. Keywords: #phi4, Help Center, JavaScript, Qwen, browser, disabled, enable, models, performance, supported, switch, x.com
    twitter.com   5 days ago
1322.  HN Ask HN: What Online LLM / Chat do you use?
The discussion on Hacker News revolves around a query concerning alternative platforms for large language models (LLMs) beyond well-known ones such as Anthropic, Grok, ChatGPT, and Qwen. The user expresses an interest in discovering other LLM chat sites to expand their options. This inquiry highlights the growing demand for diverse tools within the field of artificial intelligence, particularly those that offer varying features or experiences compared to mainstream platforms. By seeking recommendations beyond the popular choices, users are indicating a desire to explore new functionalities and innovations in AI-driven conversational interfaces, potentially leading to more tailored or specialized applications. Keywords: #phi4, Anthropic, Ask HN, Chat, ChatGPT, Grok, LLMs, Online LLM, Qwen, Recommend, Sites, Try
    news.ycombinator.com   6 days ago
   https://help.kagi.com/kagi/ai/assistant.html#avail   5 days ago
1383.  HN Qwen 3.5 9B, 4B models beating 30B, 80B models
Qwen 3.5 models (9B and 4B versions) demonstrate superior performance compared to their larger counterparts (30B and 80B) across various benchmarks. These models are part of the Qwen series, accessible through multiple platforms like Hugging Face Transformers, vLLM, SGLang, and KTransformers. The key advancements in Qwen 3.5 include a Unified Vision-Language Foundation that integrates multimodal tokens for tasks involving reasoning, coding, agents, and visual understanding. An Efficient Hybrid Architecture leveraging Gated Delta Networks and sparse Mixture-of-Experts enhances high-throughput inference while reducing latency and costs. Additionally, Scalable Reinforcement Learning Generalization ensures robust adaptability across diverse real-world scenarios by training in environments with complex task distributions. Qwen 3.5 also offers Global Linguistic Coverage, supporting 201 languages to facilitate global deployment with cultural and regional awareness. Its Next-Generation Training Infrastructure increases multimodal training efficiency compared to text-only models through asynchronous reinforcement learning frameworks. The benchmark results underscore Qwen 3.5’s proficiency in language modeling, vision-language tasks, reasoning, coding, multilingualism, and specialized domains such as STEM, puzzles, medical VQA, and video understanding. For deployment, Qwen 3.5 can be accessed via APIs using inference frameworks like SGLang, vLLM, KTransformers, and Hugging Face Transformers. It is recommended to maintain a context length of at least 128K tokens for complex tasks while optimizing performance through specific sampling parameters suited to different task types. Best practices include adjusting settings such as presence penalty and output length to enhance the model's efficiency and accuracy. Overall, the Qwen series provides robust tools designed to help developers and enterprises leverage advanced AI capabilities effectively. 
Keywords: #phi4, Hugging Face Transformers, Qwen3.5, RoPE techniques, YaRN scaling, agent applications, architecture efficiency, benchmark results, best practices, causal language model, context length, inference frameworks, linguistic coverage, models, multimodal learning, reinforcement learning, sampling parameters, tool calling, training infrastructure, ultra-long texts, vision encoder
    huggingface.co   6 days ago
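Serving through frameworks like vLLM or SGLang exposes an OpenAI-compatible endpoint, so the sampling recommendations above travel as plain fields in the request body. A sketch of such a request (the model id, endpoint path, and specific parameter values here are illustrative assumptions, not the model card's official settings):

```python
import json

# Chat-completions request body for an OpenAI-compatible server.
payload = {
    "model": "Qwen/Qwen3.5-9B",   # illustrative model id
    "messages": [
        {"role": "user", "content": "Summarize this log file."}
    ],
    "temperature": 0.7,           # assumed values; tune per task type
    "top_p": 0.8,
    "presence_penalty": 1.0,      # discourages repetitive output
    "max_tokens": 8192,           # caps output length
}
body = json.dumps(payload)
# POST `body` to e.g. http://localhost:8000/v1/chat/completions
# with any HTTP client.
```

Keeping these knobs in the request rather than the server config makes it easy to use different sampling profiles for, say, reasoning-heavy versus extraction tasks against the same deployment.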
1459.  HN The Qwen 3.5 Small Model Series
Users attempting to access the Qwen 3.5 Small Model Series page on x.com are blocked because JavaScript is disabled in their browsers. The page prompts them to enable JavaScript or switch to a browser that supports it, and directs them to the Help Center for instructions on regaining site functionality. Keywords: #phi4, Help Center, JavaScript, Qwen, browser, enable, model, series, supported, switch, x.com
    twitter.com   6 days ago
1477.  HN Qwen3.5 Small: 0.8B, 2B, 4B, 9B Released
Qwen3.5 introduces a new model family from Qwen with two distinct variations tailored to different use cases. The first variation, Qwen3.5 Small, is designed for more compact applications and includes models with configurations of 0.8B, 2B, 4B, and 9B parameters, catering to users seeking efficient performance at a smaller scale. In contrast, the second variation, Qwen3.5 Medium, provides larger-scale options with model sizes ranging from 35B-A3B, 27B, 122B-A10B, up to an extensive 397B-A17B configuration, intended for applications requiring greater capacity and complexity in data processing. This bifurcation allows users to select models based on their specific requirements, balancing between computational efficiency and model capability. Keywords: #phi4, 08B, 122B-A10B, 27B, 2B, 35B-A3B, 397B-A17B, 4B, 9B, Medium, Qwen, Released, Small, model family
    huggingface.co   6 days ago
   https://news.ycombinator.com/item?id=47217305   6 days ago
   https://www.reddit.com/r/LocalLLaMA/comments/   6 days ago
1523.  HN Qwen 3.5: 9B, 4B, 2B, 0.8B
The text details the "Qwen3.5" AI model series from Hugging Face, tailored specifically for image-to-text tasks, with varying parameter sizes ranging from 0.8 billion to 403 billion. These models include multiple versions such as Qwen3.5-9B and Qwen3.5-4B, all of which have been recently updated within days or hours, highlighting the dynamic development in this area. Beyond these, the text mentions related collections like "Qwen3-Coder-Next" and the "Qwen2.5" series, indicating a broader suite of AI solutions available. Hugging Face also offers additional resources such as datasets, community support, and enterprise-level applications, which are integrated into their platform. The collection's popularity is evident from its high upvote counts, suggesting significant user engagement. Furthermore, the platform provides an organized interface that allows users to explore these models effectively, view recent updates, and access comprehensive documentation or guidance, enhancing usability for both novice and advanced users in AI exploration. Keywords: #phi4, Collections, Community, Datasets, Docs, Enterprise, Hugging Face, Image-Text-to-Text, Models, Pricing, Qwen3.5, Spaces, Updated, Versions
    huggingface.co   6 days ago
2669.  HN Chinese Propaganda in Open Source AI: Moxie Marlinspike's Confer
The text is a multifaceted discussion on the intersection of technology, privacy, and ethics, articulated by Tommaso Gagliardoni. It critiques Chinese influence in open-source AI models, exemplified by Infomaniak's Euria, which are suspected of being biased due to reliance on technologies like Qwen. The author underscores a lack of transparency that could lead to user manipulation, advocating for ethical and transparent practices in the West. In contrast, Moxie Marlinspike’s Confer is mentioned as a more secure yet imperfect alternative. Gagliardoni also introduces his consultancy firm, "Lucumo Security," focusing on areas like cryptographic engineering, highlighting personal frustrations with opaque conference submission processes. He critiques the EU's proposed Digital Omnibus for potentially weakening GDPR protections and expresses concerns about "ChatControl" – a proposal for client-side scanning that threatens privacy despite not breaking encryption. The author criticizes perceptual hashing and ML classifiers in security due to their unreliability, particularly pointing out how these can be misused by malicious actors. He opposes initiatives like ChatControl for compromising end-to-end (E2E) encryption and has actively reached out to government officials to advocate against such measures. On quantum computing, the text challenges conventional progress metrics based on factorization records, arguing that advancements should instead focus on developing error-corrected qubits. In discussing Web3 technologies, Gagliardoni warns of their dual potential for enhancing privacy while also facilitating illicit activities like ransomware and money laundering, criticizing oversimplified regulatory solutions like backdoors. The narrative then transitions to fintech in the Web3 space, emphasizing the need for compliance mechanisms that balance privacy with societal auditability. 
It suggests five properties – decentralization, accountability, warranty, opt-in, and programmability – as essential for compliant systems, drawing from research at Horizen Labs. Gagliardoni recounts the mysterious disappearance of cybersecurity professor Xiaofeng Wang and his wife, exploring possible espionage or political motivations and reflecting on broader societal issues like racial profiling in America. In cryptography, he notes NIST's selection of hash-based cryptography (HQC) for quantum resistance due to its performance benefits over lattice-based methods. Further technological observations include criticism of cashless restaurant policies reliant on QR codes, privacy concerns associated with instant messaging platforms, and personal updates on career transitions. The author concludes with reflections on the DEF CON and Black Hat conferences, noting logistical issues in recent events. Overall, this comprehensive text serves as a call to action for nuanced technological policy-making, advocating for transparency, ethical AI development, robust data protection regulations, and balanced approaches to privacy and compliance in emerging technologies. 
Keywords: #phi4, AI Censorship, AML, Academic Integrity, Accountability, Anonymity, Anonymous Payments, Authorities, Backdoors, Bias, ChatControl, China, Chinese Propaganda, Classic McEliece, Client-Side Scanning, Code-Based KEM, Compliance, Cryptanalysis, Cryptocurrency, Cryptography, Cybersecurity, Cypherpunk, DEF CON, Data Poisoning, Data Protection, Decentralized Finance, Declassification, Digital Sovereignty, E2EE Encryption, EU Antitrust, Encryption, Espionage, Euria, European Models, FBI, False Negatives, False Positives, Feedback, Fintech, Fully Homomorphic Encryption, GDPR, HQC, Horizen Labs, Infomaniak, Instant Messaging, Jabber, KYC, Kleptographic Attacks, LUKS, Libertarianism, Logical Qubits, ML Classifiers, MPC, Malicious Adversary, Matrix, Mistral, Model Extraction, Moxie Marlinspike, Multi-Party Computation, Non-Abelian Anyons, Open Source AI, Opt-In, P2P, Peer Review, Perceptual Hashing, Post-Quantum Cryptography, Programmability, Quantum Computing, Quantum Security, Qwen, Regulation, Security Flaws, Semi-Anonymous, Signal, Surveillance Capitalism, TEE Technology, Taler, Topological QC, Transaction, Transparency, Warrant, Web3, Web3 Conferences, Whisper, XMPP, Zero-Knowledge Proofs
    gagliardoni.net   11 days ago
2813.  HN Qwen 3.5 small models out
The document offers a comprehensive overview of Qwen 3.5 models, which are advanced language and vision-language foundation models designed for integration with frameworks such as Hugging Face Transformers and KTransformers. These models enhance performance through multimodal learning, architectural efficiency using Gated Delta Networks and sparse Mixture-of-Experts, scalable reinforcement learning across complex environments, and extensive linguistic coverage encompassing 201 languages. They maintain parity with earlier Qwen models in various benchmarks while introducing a unified vision-language foundation approach. Key features of the Qwen3.5 models include their efficient hybrid architecture that optimizes inference processes, improved scalability in reinforcement learning for handling intricate tasks, and support for a wide range of global languages to facilitate inclusive deployment. The document details how these models can be effectively deployed via APIs using frameworks like SGLang, vLLM, and KTransformers, with specific recommendations on sampling parameters and context lengths to enhance performance. Additionally, the document provides benchmark results that highlight Qwen3.5's superior capabilities in language processing, coding, reasoning, and vision-language tasks. For practical application, it includes quickstart guides, API usage examples, and best practices for integrating these models into systems while achieving optimal performance. The document concludes by encouraging users to cite their work if they find the models beneficial. Keywords: #phi4, API integration, Alibaba Cloud Model Studio, Hugging Face Transformers, Qwen3.5, RoPE techniques, YaRN scaling, agent applications, architecture efficiency, benchmark results, causal language model, context length, inference frameworks, linguistic coverage, models, multimodal learning, reinforcement learning, sampling parameters, tool calling, training infrastructure, vision encoder
    huggingface.co   12 days ago
2843.  HN Qwen 3.5 Medium Model Series
The Qwen 3.5 Medium Model Series page on x.com requires JavaScript, which the visitor's browser has disabled, so the content cannot be displayed. Users are advised to enable JavaScript or switch to a compatible browser for the site to work; the Help Center provides a list of supported browsers. Keywords: #phi4, Help Center, JavaScript, Medium, browser, disabled, enable, model, series, supported, switch, x.com
    twitter.com   12 days ago